Preface



Static analysis is a research area aimed at developing principles and tools for 
verification, certification, semantics-based manipulation, and high-performance 
implementation of programming languages and systems. The series of Static 
Analysis symposia has served as the primary venue for presentation and discus- 
sion of theoretical, practical, and application advances in the area. 

This volume contains the papers accepted for presentation at the 11th
International Static Analysis Symposium (SAS 2004), which was held August 26-28
in Verona, Italy. In response to the call for papers, 63 contributions were
submitted from 20 different countries. Following on-line discussions, the
Program Committee met in Verona on May 06 and selected 23 papers, basing this
choice on their scientific quality, originality, and relevance to the
symposium. Each paper was reviewed by at least 3 PC members or external
referees. In addition to the contributed papers, this volume includes
contributions by outstanding invited speakers: a full invited paper by Thomas
Henzinger (University of California at Berkeley), and abstracts of the talks
given by the other invited speakers, Sheila McIlraith (University of Toronto),
Ehud Shapiro (Weizmann Institute), and Yannis Smaragdakis (Georgia Institute of
Technology).

On behalf of the Program Committee, the Program Chair would like to
thank all the authors who submitted papers and all external referees for their
careful work in the reviewing process. The Program Chair would like to thank
in particular Samir Genaim, who did an invaluable, excellent job in organizing
the Program Committee meeting and the structure of this volume. We would
like to express our gratitude to the Dipartimento di Informatica and to the
Università degli Studi di Verona, in particular to Prof. Elio Mosele (president
of the university), who handled the logistical arrangements and provided
financial support for organizing this event.

SAS 2004 was held concurrently with LOPSTR 2004, International Symposium
on Logic-Based Program Synthesis and Transformation; PEPM 2004,
ACM SIGPLAN Symposium on Partial Evaluation and Program Manipulation;
and PPDP 2004, ACM SIGPLAN International Conference on Principles and
Practice of Declarative Programming. There were also several workshops in the
area of programming languages. We would like to thank Sandro Etalle (LOPSTR
PC Chair), Nevin Heintze and Peter Sestoft (PEPM PC Chairs), Eugenio Moggi
(PPDP General Chair), Fausto Spoto (Organizing Chair), and David Warren
(PPDP PC Chair) for their help with the organizational aspects. Special thanks
to all the members of the Organizing Committee, who worked with enthusiasm to
make this event possible, and to ENDES, specifically to Anna Chiara Caputo,
for the great job she did in the local organization.



Verona, June 2004 



Roberto Giacobazzi 




Organization 



Program Committee 



Thomas Ball                  Microsoft, USA
Radhia Cousot                École Polytechnique, France
Roberto Giacobazzi (Chair)   Università di Verona, Italy
Chris Hankin                 Imperial College London, UK
Thomas Jensen                IRISA, France
Jens Knoop                   Technische Universität Wien, Austria
Giorgio Levi                 Università di Pisa, Italy
Laurent Mauborgne            École Normale Supérieure, France
Andreas Podelski             Max-Planck-Institut für Informatik, Germany
Germán Puebla                Technical University of Madrid, Spain
Ganesan Ramalingam           IBM, USA
Francesco Ranzato            Università di Padova, Italy
Martin Rinard                Massachusetts Institute of Technology, USA
Andrei Sabelfeld             Chalmers University of Technology, Sweden
Mary Lou Soffa               University of Pittsburgh, USA
Harald Søndergaard           University of Melbourne, Australia
Reinhard Wilhelm             Universität des Saarlandes, Germany



Steering Committee 



Patrick Cousot               École Normale Supérieure, France
Gilberto Filé                Università di Padova, Italy
David Schmidt                Kansas State University, USA



Organizing Committee 

Mila Dalla Preda 
Samir Genaim 
Isabella Mastroeni 
Massimo Merro 
Giovanni Scardoni 
Fausto Spoto 
Damiano Zanardini 







Referees 

Elvira Albert 
M. Anton Ertl 
Roberto Bagnara 
Roberto Barbuti 
Joerg Bauer 
Michele Bugliesi 
V.C. Sreedhar 
Paul Caspi 
Patrick Cousot 
Alexandru D. Salcianu 
Mila Dalla Preda 
Ferruccio Damiani 
Bjorn De Sutter 
Bjoern Decker 
Pierpaolo Degano 
Nurit Dor 
Manuel Fähndrich
Jérôme Feret
Gilberto Filé
Steve Fink 
Bernd Finkbeiner 
Cormac Flanagan 
Maurizio Gabbrielli 
Samir Genaim 
Roberta Gori 
David Grove 
Daniel Hedin 
Dan Hirsch 
Charles Hymans
Daniel Kaestner 
John Kodumal 
Andreas Krall 
Viktor Kuncak 
Kung-Kiu Lau 
Francesca Levi 
Donglin Liang 
Andrea Maggiolo Schettini 
Isabella Mastroeni 
Ken McMillan 



Massimo Merro 
Antoine Miné
Anders Møller
David Monniaux
Carlo Montangero
Damien Massé
Markus Müller-Olm
Ulrich Neumerkel 
Jens Palsberg 
Filippo Portera 
Franz Puntigam 
Xavier Rival 

Enric Rodríguez-Carbonell
Sabina Rossi 
Salvatore Ruggieri 
Andrey Rybalchenko
René Rydhof Hansen
Oliver Rüthing
Mooly Sagiv 
Giovanni Scardoni 
Dave Schmidt 
Bernhard Scholz 
Markus Schordan 
Francesca Scozzari 
Clara Segura
Helmut Seidl 
Alexander Serebrenik 
Vincent Simonet 
Fabio Somenzi 
Fausto Spoto 
Zhendong Su 
Francesco Tapparo 
Ashish Tiwari 
Thomas Wies 
Sebastian Winkel 
Zhe Yang 
Enea Zaffanella 
Damiano Zanardini 
Andreas Zeller 




Table of Contents

Invited Talks

  Injecting Life with Computers
    Ehud Shapiro

  The Blast Query Language for Software Verification
    Dirk Beyer, Adam J. Chlipala, Thomas A. Henzinger, Ranjit Jhala,
    and Rupak Majumdar

  Program Generators and the Tools to Make Them
    Yannis Smaragdakis

  Towards Declarative Programming for Web Services
    Sheila McIlraith

Program and System Verification

  Closed and Logical Relations for Over- and Under-Approximation of Powersets
    David A. Schmidt

  Completeness Refinement in Abstract Symbolic Trajectory Evaluation
    Mila Dalla Preda

  Constraint-Based Linear-Relations Analysis
    Sriram Sankaranarayanan, Henny B. Sipma, and Zohar Manna

  Spatial Analysis of BioAmbients
    Hanne Riis Nielson, Flemming Nielson, and Henrik Pilegaard

Security and Safety

  Modular and Constraint-Based Information Flow Inference
  for an Object-Oriented Language
    Qi Sun, Anindya Banerjee, and David A. Naumann

  Information Flow Analysis in Logical Form
    Torben Amtoft and Anindya Banerjee

  Type Inference Against Races
    Cormac Flanagan and Stephen N. Freund

Pointer Analysis

  Pointer-Range Analysis
    Suan Hsi Yong and Susan Horwitz

  A Scalable Nonuniform Pointer Analysis for Embedded Programs
    Arnaud Venet

  Bottom-Up and Top-Down Context-Sensitive Summary-Based Pointer Analysis
    Erik M. Nystrom, Hong-Seok Kim, and Wen-mei W. Hwu

Abstract Interpretation and Algorithms

  Abstract Interpretation of Combinational Asynchronous Circuits
    Sarah Thompson and Alan Mycroft

  Static Analysis of Gated Data Dependence Graphs
    Charles Hymans and Eben Upton

  A Polynomial-Time Algorithm for Global Value Numbering
    Sumit Gulwani and George C. Necula

Shape Analysis

  Quantitative Shape Analysis
    Radu Rugina

  A Relational Approach to Interprocedural Shape Analysis
    Bertrand Jeannet, Alexey Loginov, Thomas Reps, and Mooly Sagiv

  Partially Disjunctive Heap Abstraction
    Roman Manevich, Mooly Sagiv, Ganesan Ramalingam, and John Field

Abstract Domain and Data Structures

  An Abstract Interpretation Approach for Automatic Generation
  of Polynomial Invariants
    Enric Rodríguez-Carbonell and Deepak Kapur

  Approximating the Algebraic Relational Semantics of Imperative Programs
    Michael A. Colón

  The Octahedron Abstract Domain
    Robert Clarisó and Jordi Cortadella

  Path-Sensitive Analysis for Linear Arithmetic and Uninterpreted Functions
    Sumit Gulwani and George C. Necula

Shape Analysis and Logic

  On Logics of Aliasing
    Marius Bozga, Radu Iosif, and Yassine Lakhnech

  Generalized Records and Spatial Conjunction in Role Logic
    Viktor Kuncak and Martin Rinard

Termination Analysis

  Non-termination Inference for Constraint Logic Programs
    Etienne Payet and Fred Mesnard

Author Index




Injecting Life with Computers 



Ehud Shapiro 

Department of Computer Science and Applied Mathematics and 
Department of Biological Chemistry 
Weizmann Institute of Science, Rehovot 76100, Israel 



Abstract. Although electronic computers are the only “computer species” we
are accustomed to, the mathematical notion of a programmable computer has
nothing to do with wires and logic gates. In fact, Alan Turing’s notional
computer, which marked in 1936 the birth of modern computer science and still
stands at its heart, has greater similarity to natural biomolecular machines
such as the ribosome and polymerases than to electronic computers. Recently, a
new “computer species” made of biological molecules has emerged. These simple
molecular computers inspired by the Turing machine, of which a trillion can
fit into a microliter, do not compete with electronic computers in solving
complex computational problems; their potential lies elsewhere. Their
molecular scale and their ability to interact directly with the biochemical
environment in which they operate suggest that in the future they may be the
basis of a new kind of “smart drugs”: molecular devices equipped with the
medical knowledge to perform disease diagnosis and therapy inside the living
body. They would detect and diagnose molecular disease symptoms and, when
necessary, administer the requisite drug molecules to the cell, tissue or
organ in which they operate. In the talk we review this new research direction
and report on preliminary steps carried out in our lab towards realizing its
vision.

The Blast Query Language
for Software Verification*



Dirk Beyer¹, Adam J. Chlipala², Thomas A. Henzinger¹,²,
Ranjit Jhala², and Rupak Majumdar³

¹ EPFL, Switzerland
² University of California, Berkeley
³ University of California, Los Angeles



Abstract. Blast is an automatic verification tool for checking temporal
safety properties of C programs. Blast is based on lazy predicate abstraction
driven by interpolation-based predicate discovery. In this paper, we present
the Blast specification language. The language specifies program properties at
two levels of precision. At the lower level, monitor automata are used to
specify temporal safety properties of program executions (traces). At the
higher level, relational reachability queries over program locations are used
to combine lower-level trace properties. The two-level specification language
can be used to break down a verification task into several independent calls
of the model-checking engine. In this way, each call to the model checker may
have to analyze only part of the program, or part of the specification, and
may thus succeed with fewer predicates than would be needed to analyze the
whole task at once. In addition, the two-level specification language provides
a means for structuring and maintaining specifications.



1 Introduction 

Blast, the Berkeley Lazy Abstraction Software verification Tool, is a fully au- 
tomatic engine for software model checking [11]. Blast uses counterexample- 
guided predicate abstraction refinement to verify temporal safety properties of 
C programs. The tool incrementally constructs an abstract reachability tree 
(ART) whose nodes are labeled with program locations and truth values of 
predicates. If a path that violates the desired safety property is found in the 
ART, but is not a feasible path of the program, then new predicate information 
is added to the ART in order to rule out the spurious error path. The new pred- 
icate information is added on-demand and locally, following the twin paradigms 
of lazy abstraction [11] and interpolation-based predicate discovery [8]. The pro- 
cedure stops when either a genuine error path is found, or the current ART 
represents a proof of program correctness [9]. 

* This research was supported in part by the NSF grants CCR-0085949, CCR-0234690,
and ITR-0326577.

In this paper we present the Blast input language for specifying
program-verification tasks. The Blast specification language consists of two
levels. On
the lower level, observer automata are defined to monitor the program execution
and decide whether a safety property is violated. Observer automata can be 
infinite-state and can track the program state, including the values of program 
variables and type-state information associated with individual data objects. On 
the higher level, relational queries over program locations are defined which may 
specify both structural program properties (e.g., the existence of a syntactic path 
between two locations) and semantic program properties (e.g., the existence of 
a feasible path between two locations). The evaluation of a semantic property 
invokes the Blast model-checking engine. A semantic property may also refer 
to an observer automaton, thus combining the two levels of specification. 

Consider the following example. If we change the definition of a variable in a 
program, we have to review all subsequent read accesses to that variable. Using 
static analysis we can find all statements that use the variable, but the resulting 
set is often imprecise (e.g., it may include dead code) because of the path- 
insensitive nature of the analysis. Model checking can avoid this imprecision. 
In addition, using an observer automaton, we can ensure that we compute only 
those statements subsequent to the variable definition which (1) use the variable 
and (2) are not preceded by a redefinition of the variable. The two specification 
levels allow the natural expression of such a query: on the higher level, we specify 
the location-based reachability property between definition and use locations, 
and at the lower level, we specify the desired temporal property by a monitor 
automaton that watches out for redefinitions of the variable. The resulting query 
asks the model checker for the set of definition-use pairs of program locations 
that are connected by feasible paths along which no redefinitions occur. 

The Blast specification language provides a convenient user interface: it 
keeps specifications separate from the program code and makes the model checker 
easier to use for non-experts, as no manual program annotations with specifica- 
tion code (such as assertions) are required. On one hand it is useful to orthog- 
onalize concerns by separating program properties from the source code, and 
keeping them separated during development, in order to make it easier to un- 
derstand and maintain both the program and the specification [13]. On the other 
hand it is preferable for the programmer to specify program properties in a lan- 
guage that is similar to the programming language. We therefore use as much 
as possible C-like syntax in the specification language. The states of observer 
automata are defined using C type and variable declarations, and the automa- 
ton transitions are defined using C code. The query language is an imperative 
scripting language whose expressions specify first-order relational constraints on 
program locations. 

The two-level specification structure provides two further benefits. First, such 
structured specifications are easy to read, compose, and revise. The relational 
query language allows the programmer to treat the program as a database of 
facts, which can be queried by the analysis engine. Moreover, individual parts 
of a composite query can be checked incrementally when the program changes, 
as in regression testing [10]. Second, the high-level query language can be used 
to break down a verification task into several independent model-checking
problems, each checking a low-level trace property. Since the number of predicates
in the ART is the main source of complexity for the model-checking procedure, 
the decomposition of a verification task into several independent subtasks, each 
involving only a part of the program and/or a part of the specification, can 
greatly contribute to the scalability of the verification process [14,17]. A sim- 
ple instance of this occurs if a specification consists of a conjunction of several 
properties that can be model checked independently. The relational query engine 
allows the compact definition of such proof-decomposition strategies. 

For a more instructive example, suppose that we wish to check that there
is no feasible path from a program location ℓ₀ to a program location ℓ₂, and
that all syntactic paths from ℓ₀ to ℓ₂ go through location ℓ₁. Then we may
decompose the verification task by guessing an intermediate predicate p₁ and
checking, independently, the following two simpler properties: (1) there is no
feasible path from ℓ₀ to ℓ₁ such that p₁ is false at the end of the path (at ℓ₁),
and (2) there is no feasible path from ℓ₁ to ℓ₂ such that p₁ is true at the beginning
of the path (at ℓ₁). Both proof obligations (1) and (2) may be much simpler to
model check, with fewer predicates needed, than the original verification task.
Moreover, each of the two proof obligations can be specified as a reachability
query over locations together with an observer automaton that specifies the final
(resp. initial) condition p₁.
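
To make the decomposition concrete, the following C sketch marks three locations in a toy function and takes y >= 0 as the guessed intermediate predicate. Everything in it is an assumption made for illustration: the function toy, the counters, and the choice of predicate do not come from the paper, and running the code only observes a few executions, whereas a model checker would establish both obligations for all inputs.

```c
#include <assert.h>

/* Hypothetical counters, used only to observe runs of this sketch. */
static int p1_false_at_l1 = 0; /* arrivals at l1 on which p1 (y >= 0) failed */
static int reached_l2 = 0;     /* arrivals at the undesired location l2 */

/* Toy program with locations l0, l1, l2 marked in comments. */
void toy(int x) {
    int y;
    /* l0 */
    y = (x < 0) ? 0 : x;       /* clamp, so that y >= 0 afterwards */
    /* l1: obligation (1) states that p1 holds on every arrival here */
    if (!(y >= 0)) {
        p1_false_at_l1++;
    }
    /* obligation (2): starting at l1 with p1 true, l2 is unreachable */
    if (y < 0) {
        /* l2 */
        reached_l2++;
    }
}

/* Checks both observations after some runs of toy. */
void check_obligations(void) {
    assert(p1_false_at_l1 == 0); /* matches obligation (1) */
    assert(reached_l2 == 0);     /* follows from (1) and (2) together */
}
```

Calling toy on a few inputs and then check_obligations exercises both checks. Obligation (1) needs only the clamping assignment, and obligation (2) needs only the final test, which is the sense in which each subtask looks at part of the program.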

The paper is organized as follows. In Section 2, we define the (lower-level) 
language for specifying trace properties through observer automata. In Section 3, 
we define the (higher-level) language for specifying location properties through 
relational queries. In Section 4, we give several sample specifications, and in 
Section 5, we briefly describe how the query processing is implemented in Blast. 

Related Work. Automata are often used to specify temporal safety properties,
because they provide a convenient, succinct notation and are often easier
to understand than formulas of temporal logic. For example, SLIC [2]
specifications are used in the SLAM project [1] to generate C code for model
checking. However, SLIC does not support type-state properties and is limited
to the specification of interfaces, because it monitors only function calls
and returns. Metal [7] and MOPS [4] allow more general pattern languages, but
the (finite) state of the automaton must be explicitly enumerated.
Temporal-logic specifications, often enriched with syntactic sugar
(“patterns”), are used in Bandera [5] and Feaver [12]. Type-state verification
[16] is an important concept for ensuring the reliability of software, but
work in this field generally assumes that all paths of a program are feasible.
Relational algebra has been applied to analyze the structure of large programs
[3] and in dynamic analysis [6]. The decomposition of verification tasks has
also been recognized as a key issue, and strategy-definition languages have
been proposed [14, 17]. However, the use of a relational query language to
group queries and decompose proof obligations in a model-checking environment
seems novel.

2 Trace Properties: Observer Automata

Trace properties are expressed using observer automata. These provide a way 
to specify temporal safety properties of C programs based on syntactic pattern 
matching of C code. An observer automaton consists of a collection of syntactic 
patterns that, when matched against the current execution point of the observed 
program, trigger transitions in the observer. Rather than being limited to a 
finite number of states, the observer may have global variables of any C type, 
and it may track type-state information associated with the program variables. 
The observer transitions are also specified in C syntax; they may read program 
variables and both read and write observer variables. 



2.1 Syntax 

The definition of an observer automaton consists of a set of declarations, each 
defining an observer variable, a type state, an initial condition, a final condition, 
or an event. Figure 1 gives the grammar for specifying observer automata. 



Observer:     DeclSeq
DeclSeq:      Declaration | DeclSeq Declaration
Declaration:  ’GLOBAL’ CVarDef
            | ’SHADOW’ CTypeName ’{’ CFieldSeq ’}’
            | ’INITIAL’ ’{’ CExpression ’}’
            | ’FINAL’ ’{’ CExpression ’}’
            | ’EVENT’ ’{’
                  Temporal
                  ’PATTERN’ ’{’ Pattern ’}’
                  Assertion
                  Action
              ’}’
Temporal:     ’BEFORE’ | ’AFTER’ | empty
Pattern:      ParamCStmt | ParamCStmt ’AT’ LocDesc
Assertion:    ’ASSERT’ ’{’ CExpression ’}’ | empty
Action:       ’ACTION’ ’{’ CStatementSeq ’}’ | empty



Fig. 1. The grammar for the observer specification language. 



Observer Variables. The control state of an observer automaton consists of 
a global part and a per-object part. The global part of the observer state is 
determined by a set of typed, global observer variables. Each observer variable 
may have any C type, and is declared following the keyword GLOBAL, where the 
nonterminal CVarDef stands for any C variable declaration. For example, in the 
case of a specification that restricts the number of calls to a certain function, an 
observer variable numCalls of type int might be used to track the number of 
calls made: “GLOBAL int numCalls;”. 

Type States. The keyword SHADOW allows the programmer to define additional
control state of the observer automaton on a per-object basis. For this purpose, 
each distinct C type CTypeName which occurs in the program may have a type 
state declared in the specification. The type-state information is declared by 
the nonterminal CFieldSeq, which stands for any sequence of field definitions 
for a C structure. These fields are then added as type state to every program 
variable of type CTypeName. For example, in the case that the program uses 
a type stack to declare stacks, the following type state may be used to track 
the size of each program variable of type stack: “SHADOW stack { int size; }”.
Then, during verification, the type stack is replaced by a new structure type 
with the additional field size. 

Initial and Final Conditions. The initial states of the observer automaton 
are defined by initial conditions. Each initial condition is declared following the 
keyword INITIAL as a boolean expression. The nonterminal CExpression is a 
(side-effect free) C expression that may refer to observer variables, but also to 
global program variables and associated type-state information. This allows us 
to encode a precondition when starting the verification process. We call the 
conjunction of all initial conditions the precondition of the observer automaton. 
If no initial condition is specified, then the precondition is true. Final conditions 
are just like initial conditions, and their conjunction is called the postcondition 
of the observer automaton. The postcondition is used to check the program and 
observer states after any finite trace. 

Events. The transitions of the observer automaton are defined by events. Each 
event observes all program steps and, if a match is obtained, specifies how the 
state of the observer (global variables and type states) changes. The keyword 
EVENT is followed by up to four parts: a temporal qualifier, a pattern, an assertion, 
and an action. Intuitively, at each point in the program execution, the observer 
checks the current program statement (i.e., AST node) being executed against 
the pattern of each event. If more than one pattern matches, then Blast declares 
the specification to be invalid for the given program. If only one pattern matches, 
then the corresponding assertion is checked. If the assertion is violated, then the 
observer rejects the trace; otherwise it executes the corresponding action. The 
Temporal qualifier is either BEFORE or AFTER. It specifies whether the observer 
transition is executed before or after the source-code AST node that matches 
the pattern. If a temporal qualifier is omitted, it is assumed to be BEFORE. 

The keyword PATTERN is followed by a statement that is matched against the
program source code. The pattern is defined by the nonterminal ParamCStmt,
followed by an optional program-location descriptor. A pattern is either a C
assignment statement or a C function call that involves side-effect free
expressions. The pattern may refer to variables named $i, for i ≥ 1, which are
matched against arbitrary C expressions in the program. Each such pattern
variable may appear at most once in a pattern. There is also a pattern
variable named $?, which plays the role of a wild-card. It may occur multiple
times in a pattern, and different occurrences may match the empty string, a C
expression, or an arbitrary number of actual parameters in a function call.
The location descriptor LocDesc is either a C label, or a string that
concatenates the source file name with a line number; e.g., the string
“file_19” refers to line number 19 of the source file file. If a location
descriptor is specified, then the pattern is matched only against program
locations that match the descriptor.
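
The binding of a numbered pattern variable can be pictured with a deliberately simplified C sketch. It is an assumption-laden illustration, not how Blast works: Blast matches patterns against AST nodes of simplified statements, whereas the hypothetical function match_one below compares plain strings and supports exactly one occurrence of $1 and no $? wild-card.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Textual stand-in for pattern matching (hypothetical helper).
   The pattern must contain exactly one "$1". On success the text bound
   to $1 is copied into out (capacity cap) and true is returned. */
bool match_one(const char *pattern, const char *stmt, char *out, size_t cap) {
    const char *hole = strstr(pattern, "$1");
    if (hole == NULL) {
        return false;                      /* no pattern variable present */
    }
    size_t pre = (size_t)(hole - pattern); /* literal text before $1 */
    const char *suffix = hole + 2;         /* literal text after $1 */
    size_t suf = strlen(suffix);
    size_t n = strlen(stmt);

    if (n < pre + suf + 1) {
        return false;                      /* binding would be empty */
    }
    if (strncmp(stmt, pattern, pre) != 0) {
        return false;                      /* prefix differs */
    }
    if (strcmp(stmt + n - suf, suffix) != 0) {
        return false;                      /* suffix differs */
    }
    size_t len = n - pre - suf;            /* length of the bound text */
    if (len + 1 > cap) {
        return false;                      /* output buffer too small */
    }
    memcpy(out, stmt + pre, len);
    out[len] = '\0';
    return true;
}
```

With the pattern "lock($1);" the statement "lock(&m);" matches and binds $1 to "&m", while "unlock(&m);" does not match. The bound text is what an assertion or action would see wherever it mentions $1.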

The keyword ASSERT is followed by a program invariant that must hold every 
time the corresponding pattern matches. Here, CExpression is a boolean con- 
dition expressed as a C expression that may refer to global program variables, 
observer variables, numbered pattern variables $i that occur in the corresponding 
pattern (which may match local program variables), and type-state information 
associated with any of these. Numbered pattern variables in an assertion refer 
to the expressions with which they are unified by the pattern matching that 
triggers the event. If an assertion is omitted, it is assumed to be always true. 
If during program execution the pattern of an event matches, but the current 
state violates the assertion, then the observer is said to reject the trace. 

The keyword ACTION is followed by a sequence of C statements that are exe- 
cuted every time the corresponding pattern matches. The code in CStatementSeq 
has the following restrictions. First, as in assertions, the only read variables are 
global program variables, observer variables, numbered pattern variables, and 
associated type states. Second, the action code may write only to observer vari- 
ables and to type-state information. In particular, an observer action must not 
change the program state. If an action is omitted, it is assumed to be the empty 
sequence of statements. 

Example 1. [Locking] Consider the informal specification that a program must 
acquire and release locks in strict alternation. The observer automaton defined in 
Figure 2(a) specifies the correct usage of locking functions. An observer variable 
locked is created to track the status of the (only) global lock. Simple events 
match calls to the relevant functions. The event for init initializes the observer 
variable to 0, indicating that the lock is not in use. The other two events ensure 
that the lock is not in use with each call of the function lock, and is in use 
with each call of unlock. When these assertions succeed, the observer variable 
is updated and execution proceeds; when an assertion fails, an error is signaled. 
The wild-cards $?’s match either a variable to which the result of a function 
call is assigned, or the absence of such an assignment, thus making the patterns 
cover all possible calls to the functions lock and unlock. 

Figure 2(b) shows the same specification, but now the program contains 
several locks, and the functions lock and unlock take a lock as a parameter. A 
lock is assumed to be an object of type lock_t. The observer introduces a type 
state locked with each lock of the program, and checks and updates the type 
state whenever one of the functions init, lock, and unlock is called. □ 
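
One way to read the global-lock specification operationally is as a check and an update placed in front of each observed call. The C sketch below is an illustration under stated assumptions rather than Blast output: the stub functions prog_init, prog_lock, and prog_unlock stand in for the program's init, lock, and unlock, and the wrapper names and the rejected flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Observer variable, as in "GLOBAL int locked;". */
static int locked;

/* Hypothetical flag standing in for the observer's rejecting state. */
static bool rejected = false;

/* Stubs for the observed program's functions (assumed for this sketch). */
static void prog_init(void)   {}
static void prog_lock(void)   {}
static void prog_unlock(void) {}

/* init: no assertion, the action resets the observer variable. */
void observed_init(void) {
    locked = 0;
    prog_init();
}

/* lock: assertion locked == 0, action locked = 1. */
void observed_lock(void) {
    if (locked == 0) {
        locked = 1;
    } else {
        rejected = true;   /* assertion violated: reject the trace */
    }
    prog_lock();
}

/* unlock: assertion locked == 1, action locked = 0. */
void observed_unlock(void) {
    if (locked == 1) {
        locked = 0;
    } else {
        rejected = true;   /* assertion violated: reject the trace */
    }
    prog_unlock();
}
```

A trace init, lock, unlock leaves rejected false, while a second unlock in a row sets it. The checks run before the call because BEFORE is the default temporal qualifier.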



2.2 Semantics 

The semantics of a trace property is given by running the observer automaton
in parallel with the program. The automaton accepts a program trace if along
the trace, every time an observer event matches, the corresponding assertion is
true, and moreover, if the trace is finite, then the values of the variables at the
end of the trace satisfy the postcondition of the observer automaton. Dually,
the automaton rejects the trace if either some assertion or the postcondition
fails. We give the semantics of the composition of a program and an observer
automaton by instrumenting the program source code with C code for the
observer variable, type-state, and event declarations, i.e., the original program is
transformed into a new program by a sequence of simple steps. This
transformation is performed statically on the program before starting the
model-checking engine on the transformed program.

(a)
    GLOBAL int locked;

    EVENT {
      PATTERN { $? = init(); }
      ACTION { locked = 0; }
    }
    EVENT {
      PATTERN { $? = lock(); }
      ASSERT { locked == 0 }
      ACTION { locked = 1; }
    }
    EVENT {
      PATTERN { $? = unlock(); }
      ASSERT { locked == 1 }
      ACTION { locked = 0; }
    }

(b)
    SHADOW lock_t { int locked; }

    EVENT {
      PATTERN { init($1); }
      ACTION { $1->locked = 0; }
    }
    EVENT {
      PATTERN { lock($1); }
      ASSERT { $1->locked == 0 }
      ACTION { $1->locked = 1; }
    }
    EVENT {
      PATTERN { unlock($1); }
      ASSERT { $1->locked == 1 }
      ACTION { $1->locked = 0; }
    }

Fig. 2. (a) Specification for a global lock, (b) Specification for several locks.

Syntactic pattern matching on literal C code must deal with code structuring 
issues. Blast performs pattern matching against a simplified subset of C state- 
ments. In our implementation, first a sound transformation from C programs 
to the simplified statement language is performed by CIL [15]. These simplified 
statements consist only of variable assignments and function calls involving side- 
effect free expressions. Second, Blast’s instrumentation of the program with the 
observer is performed on the simplified language. Third, Blast performs model 
checking on the instrumented program, which is represented by a graph whose 
nodes correspond to program locations and whose edges are labeled with se- 
quences of simplified statements [11]. The model checker takes as input also the 
pre- and postconditions of the observer automaton, as described in the next 
section. 

Instrumenting Programs. In the following we define the program instrumen- 
tation with the observer automaton by describing a transformation rule for each 
construct of the observer specification. 

Observer Variables. Declarations of observer variables are inserted as global dec- 
larations in the C program. 

Type State. The type-state fields declared by the observer automaton are inserted
into the declarations section of the C program by replacing the original
declarations of the corresponding types. The actual transformation depends on 
the “shadowed” type. If the shadowed type is abstract, then the type itself is 
replaced. In this case, the fields of the original type cannot be analyzed, because 
their definition is not available. If the shadowed type is not abstract, then the 
original type becomes one field of the new type, with the other fields holding the 
type-state information. All type accesses in the program are modified accord- 
ingly. For the example in Figure 2(b), first assume that lock_t is an abstract 
type for the locking data structure. Then the type-state declaration 

SHADOW lock_t { int locked; } 

is transformed and inserted as follows in the declarations section of the program: 

struct shadow0 { int locked; };
typedef struct shadow0 *lock_t;

If, on the other hand, the type lock_t is defined as 

struct lock_t_struct { int lock_info; };
typedef struct lock_t_struct *lock_t;

then the type name is changed to lock_t_orig and the type-state declaration is 
transformed and inserted as follows: 

struct shadow0 { lock_t_orig shadowed; int locked; };
typedef struct shadow0 *lock_t;

Additionally, in this case, for every instance mylock of type lock_t, each occurrence
of mylock->lock_info is replaced by mylock->shadowed->lock_info.
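
The non-abstract case can be pictured with a small C sketch. It only illustrates the rewriting described above and is not output produced by Blast; the helpers new_lock and lock_info_of, and the choice to start the type state at 0, are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Original program type, renamed to lock_t_orig by the instrumentation. */
struct lock_t_struct { int lock_info; };
typedef struct lock_t_struct *lock_t_orig;

/* Shadow type: the original value becomes one field, the type state another. */
struct shadow0 { lock_t_orig shadowed; int locked; };
typedef struct shadow0 *lock_t;

/* Hypothetical helper: allocate a lock together with its type-state field. */
lock_t new_lock(int info) {
    lock_t l = malloc(sizeof *l);
    if (l == NULL) return NULL;
    l->shadowed = malloc(sizeof *l->shadowed);
    if (l->shadowed == NULL) { free(l); return NULL; }
    l->shadowed->lock_info = info;
    l->locked = 0;   /* assumed initial type state */
    return l;
}

/* An access written as mylock->lock_info in the original program
   is rewritten to go through the shadowed field. */
int lock_info_of(lock_t mylock) {
    return mylock->shadowed->lock_info;
}
```

The program keeps using the name lock_t, so only field accesses need to be rewritten; the type-state field travels with every instance.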

Events. For every event declaration of the observer automaton an if-statement 
is generated. The condition of that if-statement is a copy of the assertion, where 
the pattern variables $i are replaced by the matching C expressions. The then- 
branch contains a copy of the action code, again with the place holders substi- 
tuted accordingly. The else-branch contains a transition to the rejecting state of 
the automaton. Then the original program is traversed to find every matching 
statement for the pattern of the event. The pattern is matched if the place holders
($i and $?) in the pattern can be replaced by code fragments such that the
pattern becomes identical to the examined statement. If two or more patterns 
match the same statement, then Blast stops and signals that the specification 
is invalid (ambiguous) for the given program. As specified by the temporal qual- 
ifier BEFORE or AFTER, the generated if-statement is inserted before or after each 
matching program statement. Consider, for example, the second event declara- 
tion from Figure 2(a). For this event, every occurrence of the code fragment 
lock(); matches the pattern, whether or not the return value is assigned to
a variable (because of the wild-card $? on the left-hand side of the pattern). 
The instrumentation adds the following code before every call to lock in the 
program: 

if (locked == 0) {
    locked = 1;
} else {
    { reject = 1; }  // transition to rejecting state
}

Note that the rejecting state of the observer automaton is modeled by the implicitly
defined observer variable reject. This variable must occur neither in the
program nor in the observer declaration.
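
As a rough illustration of the effect of this instrumentation, the following C sketch runs a fragment that calls lock twice in a row. The function names before_lock_event and double_lock, the empty body of lock, and the initial value of locked are assumptions for the example; the instrumentation itself inlines the generated if-statement rather than calling a helper.

```c
#include <assert.h>

/* Observer variables, inserted as globals by the instrumentation. */
int locked = 0;   /* observer state; initial value assumed to be 0 */
int reject = 0;   /* implicitly defined rejecting-state flag */

/* Stand-in for the program's lock function (hypothetical empty body). */
void lock(void) { }

/* The if-statement generated for the event, wrapped in a helper here;
   the real instrumentation inserts it before each call to lock. */
void before_lock_event(void) {
    if (locked == 0) {
        locked = 1;
    } else {
        reject = 1;   /* transition to rejecting state */
    }
}

/* An instrumented fragment that acquires the lock twice in a row. */
void double_lock(void) {
    before_lock_event();
    lock();
    before_lock_event();
    lock();
}
```

After double_lock returns, reject is set, which is the situation the model checker searches for on all feasible paths rather than on a single run.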

Observer Semantics. A state of a program P is a pair $(\ell, v)$ consisting of a
program location $\ell$ and a memory valuation $v$. Let $\ell$ and $\ell'$ be two program
locations, and let $p$ and $p'$ be two predicates over the program variables. The
pair $(\ell', p')$ is reachable in P from the pair $(\ell, p)$ if there exists an executable
state sequence (finite trace) of P from a state $(\ell, v)$ to a state $(\ell', v')$, for some
memory valuation $v$ that satisfies $p$ and some valuation $v'$ that satisfies $p'$.

We can now define the semantics of an observer automaton A over a program
P in terms of the traces of the instrumented program $P_A$. Let pre be the predicate
$pre_A \wedge (reject = 0)$, where $pre_A$ is the precondition of A, and let post be the
predicate $post_A \wedge (reject = 0)$, where $post_A$ is the postcondition of A. The
location $\ell'$ is A-accept-reachable in P from $\ell$ if $(\ell', post)$ is reachable in $P_A$ from
$(\ell, pre)$. The location $\ell'$ is A-reject-reachable in P from $\ell$ if $(\ell', \neg post)$ is reachable
in $P_A$ from $(\ell, pre)$. Note that both accept- and reject-reachability postulate the
existence of a feasible path in P from $\ell$ to $\ell'$; the difference depends only on
whether the observer automaton accepts or rejects. In particular, it may be that
$\ell'$ is both A-accept- and A-reject-reachable in P from $\ell$.

3 Location Properties: Relational Queries

Every observer automaton encodes a trace property. At a higher level, observer 
automata can be combined by relational queries. The queries operate on pro- 
gram locations and specify properties using sets and relations over program lo- 
cations. The query language is an imperative scripting language that extends the 
predicate calculus: it provides first-order relational expressions (but no function 
symbols) as well as statements for variable assignment and control flow. 

3.1 Syntax 

A simple query is a sequence of statements, where each statement is either an 
assignment or a print statement. There are three types of variables: string, prop- 
erty, and relation variables. A string variable may express a program location, a 
function name, or a code fragment. The property variables range over observer 
automata (i.e., trace properties), as defined in the previous section. The relation 
variables range over sets of tuples of strings. There is no need to declare the 
type of a variable; it is determined by the value of the first assignment to the 
variable. For the convenient and structured expression of more complex queries, 
the language also has constructs (IF, WHILE, FOR) for the conditional execution 
and iteration of statements. 







Statement: PropVar ':=' '[' Observer ']' ';'
         | RelVar '(' StrExp ',' StrExp ')' ':=' BoolExp ';'
         | 'PRINT' StrExp ';' | 'PRINT' BoolExp ';'
BoolExp:   RelVar '(' StrExp ',' StrExp ')'
         | 'TRUE' '(' StrVar ')' | 'FALSE' '(' StrVar ')'
         | BoolExp '&' BoolExp       // conjunction
         | BoolExp '|' BoolExp       // disjunction
         | '!' BoolExp               // negation
         | 'EXISTS' '(' StrVar ',' BoolExp ')'
         | 'MATCH' '(' RegExp ',' StrVar ')'
         | 'A-REACH' '(' BoolExp ',' BoolExp ',' PropVar ')'
         | 'R-REACH' '(' BoolExp ',' BoolExp ',' PropVar ')'
StrExp:    StrLit | StrVar



Fig. 3. Partial syntax of the query language. 



The expression language permits first-order quantification over string vari- 
ables. In the right-hand side expression of an assignment, every variable must 
either be a relation variable and have been previously assigned a value, or it 
must be a string variable that is quantified or occurs free. The implemented 
query language allows relations of arbitrary arity, but for simplicity, let us restrict
this discussion to binary relation variables. Also, let us write $x$ and $y$ for
the values of the string variables x and y, and $R$ for the set of pairs denoted
by the binary relation variable R. Then the boolean expression R(x,y) evaluates
to true iff $(x, y) \in R$. To assign a new value to the relation variable R we write
“R(x,y) := e” short for “for all x,y let R(x,y) := e,” where e is a boolean
expression that may contain free occurrences of x and y.

Each print statement has as argument a boolean expression, with possibly 
some free occurrences of string variables. The result is a print-out of all value 
assignments to the free variables which make the expression true. For example, 
“PRINT R(x,y)” outputs the header (x, y) followed by all pairs $(x, y)$ of strings
such that $(x, y) \in R$.

The grammar for queries without control-flow constructs is shown in Figure 3. 
The nonterminals StrVar, PropVar, and RelVar refer to any C identifier; StrLit 
is a string literal; Observer is a specification of an observer automaton, as defined 
in Section 2; and RegExp is a Unix regular expression. 

3.2 Semantics 

The first-order constructs (conjunction, disjunction, negation, existential
quantification) as well as the imperative constructs (assignments, control flow,
output) have the usual meaning. The boolean expression MATCH(e,x) evaluates to
true iff the value of the string variable x matches the regular expression e.

Reachability Queries. Consider an input program P, and a property variable
A denoting an observer automaton A. Let source and target be two boolean
expressions each with a single free string variable, say loc_s and loc_t. The
boolean expression A-REACH(source, target, A) evaluates to true for given values
$\ell$ for loc_s and $\ell'$ for loc_t iff source and target evaluate to true for $\ell$ and $\ell'$,
respectively, and $\ell'$ is A-accept-reachable in P from $\ell$. The boolean expression
R-REACH(source, target, A) evaluates to true for given values $\ell$ for loc_s and $\ell'$
for loc_t iff source and target evaluate to true for $\ell$ and $\ell'$, respectively, and $\ell'$
is A-reject-reachable in P from $\ell$. These relations are evaluated by invoking the
Blast model checker on the instrumented program.

Syntactic Sugar. Using the above primitives, we can define some other use- 
ful queries as follows. The property variable Empty denotes the empty observer 
automaton, which has no events and pre- and postconditions that are always 
true. The macro REACH(source, target) is short-hand for A-REACH(source, target,
Empty); it evaluates to true for given values $\ell$ for loc_s and $\ell'$ for loc_t iff
both source and target evaluate to true and there is a feasible path in P from $\ell$
to $\ell'$. The macro SAFE(source, A) is short-hand for

source & !EXISTS(loc_t, R-REACH(source, TRUE(loc_t), A))

This boolean expression evaluates to true for a given value $\ell$ for loc_s iff source
evaluates to true and there is no feasible path in P from $\ell$ which makes the
observer A enter a rejecting state.
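
To make the expansion of SAFE concrete, here is a minimal C sketch that evaluates it over a toy set of four locations. The boolean matrix rreach stands in for the result of R-REACH(source, TRUE(loc_t), A), which in the tool comes from the model checker; the names NLOC, eval_safe, and the array encoding are assumptions for the example and say nothing about the BDD-based representation the implementation uses.

```c
#include <assert.h>
#include <stdbool.h>

#define NLOC 4  /* number of program locations in this toy example */

/* safe[l] = source[l] & !EXISTS(t, rreach[l][t]).
   rreach[l][t] is true iff location t is A-reject-reachable from l. */
void eval_safe(const bool source[NLOC], bool rreach[NLOC][NLOC],
               bool safe[NLOC]) {
    for (int l = 0; l < NLOC; l++) {
        bool exists_reject = false;          /* EXISTS(loc_t, ...) */
        for (int t = 0; t < NLOC; t++) {
            if (rreach[l][t]) { exists_reject = true; break; }
        }
        safe[l] = source[l] && !exists_reject;
    }
}
```

A location is reported safe only if it satisfies source and no target location at all is reject-reachable from it.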

Syntactic Relations. There are a number of useful predefined syntactic relation 
variables. These are restricted to relations that can be extracted from the AST of 
the program. The following relations are automatically initialized after starting 
the query interpreter to access information about the syntactic structure of the 
program: 

- LOC_FUNC(loc, fname) evaluates to true iff the program location loc is
  contained in the body of the C function fname.

- LOC_FUNC_INIT(loc, fname) evaluates to true iff the program location loc
  is the initial location of the C function fname.

- LOC_LABEL(loc, lname) evaluates to true iff the location loc contains the
  C label lname.

- LOC_LHSVAR(loc, vname) evaluates to true iff the location loc contains the
  variable vname on the left-hand side of an assignment.

- LOC_RHSVAR(loc, vname) evaluates to true iff the location loc contains the
  variable vname on the right-hand side of an assignment.

- LOC_TEXT(loc, sourcecode) evaluates to true iff the C code at the location
  loc is identical to sourcecode.

- CALL(fname, funcname_callee) evaluates to true iff the function fname
  (syntactically) calls the function funcname_callee.



Other relations that reflect the syntactic structure of the program can be added 
as needed. 

Example 2. [Reachability Analysis] The following query computes all reachable
lines that contain the program code abort:

source(loc) := LOC_FUNC_INIT(loc, "main");
target(loc) :=
    EXISTS(text, LOC_TEXT(loc, text) & MATCH("abort", text));
result(loc1, loc2) := REACH(source(loc1), target(loc2));
PRINT result(loc1, _);

The first statement of the query assigns a set of program locations to the relation 
variable source. The set contains the initial location of the function main.
The second statement constructs the set of program locations
that contain the code abort. The third statement computes a set of pairs of 
program locations. A pair of locations is contained in the set result iff there is 
an executable program trace from some location in source to some location in 
target. The last statement prints out a list of all source locations with a feasible 
path to an abort statement. The symbol “_” is used as an abbreviation for an
existentially quantified string variable which is not used elsewhere. □ 

Example 3. [Dead-Code Analysis] The following query computes the set of
locations of the function main that are not reachable by any program execution 
(the “dead” locations): 

live(loc1, loc2) :=
    REACH(LOC_FUNC_INIT(loc1, "main"), LOC_FUNC(loc2, _));
reached(loc) := live(_, loc);
PRINT "Following locations within 'main' are not reachable:";
PRINT !reached(loc) & LOC_FUNC(loc, "main");

We first compute the set of all program locations that are reachable from the 
initial location of the function main. We print the complement of this set, which 
represents dead code, restricted to the set of locations of the function main. □ 

Both of the above examples are simple reachability queries. Examples of more 
advanced queries, which combine location and trace properties, are presented in 
the next section. 

4 Examples 

Impact Analysis. Consider the C program displayed in Figure 4(a). At the 
label START, the variable j is assigned a value. We wish to find the locations that 
are affected by this assignment, i.e., the reachable locations that use the variable 
j before it is redefined. Consider the observer automaton A shown in Figure 4(b). 
Along a trace, every assignment to j increments the variable gDefined. Thus, 
gDefined is equal to 1 only when there has been exactly one definition of j. The
final condition ensures that along a finite trace, no redefinition of j has occurred. 
Hence, the desired set of locations is computed by the following query: 

 1  int j;
 2  void f(int j) {};
 3  int compute() {
 4    int i;
 5    START: j = 1;
 6    i = 1;
 7    if (i==0) {
 8      f(j);          // affected if i==0
 9    }
10    if (i==0) {
11      j = 2;
12    }
13    if (j==2) {      // affected if i==1
14      f(j);          // not affected if i==0
15    }
16    return 0;
17  }
18  int main() {
19    compute();
20    return 0;
21  }

GLOBAL int gDefined;
INITIAL { gDefined == 0 }
EVENT {
  PATTERN { j = $1; }
  ACTION { gDefined++; }
}
FINAL { gDefined == 1 }



Fig. 4. (a) C program, (b) Impact automaton A. 



(a)
GLOBAL int __E;
INITIAL (__E == 0);
EVENT {
  PATTERN { $? = seteuid($1); }
  ACTION { __E = $1; }
}

(b)
EVENT {
  PATTERN { $? = system($?); }
  ASSERT { __E != 0 }
}



Fig. 5. (a) Effective UID automaton, (b) Syscall privilege automaton. 



affected(l1, l2) :=
    A-REACH(LOC_LABEL(l1, "START"), LOC_RHSVAR(l2, "j"), A);
PRINT affected(_, l2);

For our example, Blast reports that the definition of the variable j at line 5 
has impact on line 13. It has no impact on line 8, as that line is not reachable 
because of line 6. On the other hand, if line 6 is changed to “i=0;”, then line 8 is
reachable and affected. Now, line 11 is reachable and therefore a redefinition of j
takes place. Thus, line 13 is not affected. To compute the effect of each definition
of j, we can change the first argument of A-REACH to LOC_LHSVAR(l1, "j").
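
The two scenarios just described can be replayed with a hand-instrumented version of the program from Figure 4(a). This is a sketch under assumptions: the parameter i_init replaces the constant on line 6 so that both cases can be run, and the flags affected8 and affected13 are hypothetical bookkeeping that records whether a use of j is reached while gDefined equals 1. A concrete run follows one trace only, whereas the model checker considers all feasible traces.

```c
#include <assert.h>

int j;
int gDefined;                /* observer variable of automaton A */
int affected8, affected13;   /* hypothetical flags: use of j reached with gDefined == 1 */

void f(int x) { (void)x; }

/* Figure 4(a) with the ACTION of Figure 4(b) placed next to each
   assignment to j. */
int compute(int i_init) {
    int i;
    gDefined = 0;                           /* INITIAL { gDefined == 0 } */
    affected8 = affected13 = 0;
    j = 1; gDefined++;                      /* line 5 (START) */
    i = i_init;                             /* line 6 */
    if (i == 0) {
        if (gDefined == 1) affected8 = 1;   /* line 8 uses j */
        f(j);
    }
    if (i == 0) {
        j = 2; gDefined++;                  /* line 11 redefines j */
    }
    if (gDefined == 1) affected13 = 1;      /* line 13 uses j */
    if (j == 2) {
        f(j);
    }
    return 0;
}
```

Calling compute(1) corresponds to the program as printed, and compute(0) to the variant with line 6 changed to “i=0;”.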

Security Analysis. Consider a simplified specification for the manipulation of 
privileges in setuid programs [4]. Unix processes can execute at several privi- 
lege levels; higher privilege levels may be required to access restricted system 
resources. Privilege levels are based on user id’s. The seteuid system call is 
used to set the effective user id of a process, and hence its privilege level. The 
effective user id 0 (or root) allows a process full privileges to access all system 
resources. The function system runs a program as a new process with the privilege
level of the current effective user id. The observer automaton B in Figure 5(a)
tracks the status of the effective user id by maintaining an observer variable
__E, which denotes the current effective user id. Initially, __E is set to 0. The $1 
pattern variable in the seteuid pattern matches the actual parameter. Every 
time seteuid is called, the value of __E is updated to be equal to the parameter 
passed to seteuid in the program. 

Suppose we want to check that the function system is never called while 
holding root privileges. This can be done by adding the event in Figure 5(b) 
to the automaton B (call the resulting automaton B′) and computing the query
“SAFE(LOC_FUNC_INIT(loc, "main"), B′)”. The $? wild-card in the system
pattern is used to match all remaining parameters. As long as the assertion is 
satisfied, the observer does nothing, because the action is empty; however, if the 
assertion is not satisfied, the trace is rejected. 
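
A small C sketch shows how the two events of the combined automaton would behave on concrete calls. The observer variable is renamed obs_E here because identifiers beginning with a double underscore are reserved in C; the helper names, the user id 1000, and run_scenario are assumptions for illustration, and no real seteuid or system call is made.

```c
#include <assert.h>

int obs_E = 0;    /* stands for the observer variable __E */
int reject = 0;   /* rejecting-state flag of the observer */

/* Event for seteuid: ACTION { __E = $1; } */
void observed_seteuid(int uid) { obs_E = uid; }

/* Event for system: ASSERT { __E != 0 } with an empty action. */
void observed_system(void) {
    if (obs_E != 0) {
        /* assertion holds: nothing to do */
    } else {
        reject = 1;   /* system called with effective user id 0 */
    }
}

/* Hypothetical scenario: optionally drop privileges, then call system. */
int run_scenario(int drop_privileges_first) {
    obs_E = 0;        /* initial condition of the automaton */
    reject = 0;
    if (drop_privileges_first) observed_seteuid(1000);
    observed_system();
    return reject;
}
```

Here run_scenario(0) ends in the rejecting state, while run_scenario(1) does not, mirroring the traces that the SAFE query rules out or admits.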

Now suppose we want to know which locations of the program can be run
with root privileges, i.e., with __E = 0. This can be accomplished by the following
query:

target(loc) := LOC_FUNC(loc, _);
rootPriv(loc1, loc2) :=
    A-REACH(LOC_FUNC_INIT(loc1, "main"), target(loc2), B″);
PRINT rootPriv(_, loc);

where automaton B″ is automaton B with the final condition “FINAL (__E == 0)”.

Decomposing Verification Tasks. We now show how the relational query
language and observer automata can be combined to decompose the verification
process [14]. Consider an event e of an observer automaton A with the postcondition
$post_A$. We say that e extends A if (1) the assertion of e is always true, and
(2) the action of e writes only to variables not read by A. Let A.e be the observer
automaton obtained by adding to A (1) a fresh observer variable x_e, (2) the
initial condition x_e == 0, and (3) the code x_e = 1 as the first instruction in
the body of the action of e. Define RSplit.A.e to be the pair of observer automata
$(A.e^+, A.e^-)$ which are A.e with the postconditions changed to $x_e = 1 \Rightarrow post_A$
and $x_e \neq 1 \Rightarrow post_A$, respectively. Define ASplit.A.e to be the pair of automata
$(A.e^+, A.e^-)$ which are A.e with the postconditions changed to $x_e = 1 \wedge post_A$
and $x_e \neq 1 \wedge post_A$, respectively.

Lemma 1. Let P be a program, let A be an observer automaton, and let e be an
event that extends A. Let $(A_1, A_2)$ = RSplit.A.e (resp. $(A_1, A_2)$ = ASplit.A.e).
A location $\ell'$ is A-reject-reachable (resp. A-accept-reachable) in P from $\ell$ iff
either $\ell'$ is $A_1$-reject-reachable (resp. $A_1$-accept-reachable) in P from $\ell$, or $\ell'$ is
$A_2$-reject-reachable (resp. $A_2$-accept-reachable) in P from $\ell$.
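
The propositional core of the lemma can be checked by brute force. The C sketch below reads the RSplit postconditions as implications (x_e = 1 implies post, and x_e != 1 implies post) and the ASplit postconditions as conjunctions, then compares them with the unsplit automaton for every combination of final values. It looks only at the final state of a trace and assumes x_e is either 0 or 1 there; the function names are hypothetical, and this is a sanity check rather than a proof of the lemma.

```c
#include <assert.h>
#include <stdbool.h>

/* x_e is the fresh flag (1 iff event e occurred on the trace);
   post_A is the truth value of A's postcondition in the final state. */
bool rejects_A(bool post_A)           { return !post_A; }
bool rejects_A1(int x_e, bool post_A) { return !(x_e != 1 || post_A); } /* post: x_e = 1  => post_A */
bool rejects_A2(int x_e, bool post_A) { return !(x_e == 1 || post_A); } /* post: x_e != 1 => post_A */

bool accepts_A(bool post_A)           { return post_A; }
bool accepts_A1(int x_e, bool post_A) { return x_e == 1 && post_A; }
bool accepts_A2(int x_e, bool post_A) { return x_e != 1 && post_A; }

/* Compare split and unsplit automata on all final-state combinations. */
bool split_agrees(void) {
    for (int x_e = 0; x_e <= 1; x_e++) {
        for (int p = 0; p <= 1; p++) {
            bool post_A = (p == 1);
            bool r = rejects_A1(x_e, post_A) || rejects_A2(x_e, post_A);
            bool a = accepts_A1(x_e, post_A) || accepts_A2(x_e, post_A);
            if (r != rejects_A(post_A)) return false;
            if (a != accepts_A(post_A)) return false;
        }
    }
    return true;
}
```

Each trace falls into exactly one of the two halves, depending on whether e occurred, which is why the disjunction of the two split queries loses nothing.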

The split partitions the program traces into those where the event e occurs 
and those where it doesn’t occur. We can now extend our query language to 
allow for boolean macro expressions of the following kind: b SPLIT e, where b is 
a boolean expression and e is an event. This macro stands for b with each occurrence
of a subexpression of the form R-REACH(·, ·, A), where e extends A, replaced
by R-REACH(·, ·, A_1) | R-REACH(·, ·, A_2), where $(A_1, A_2)$ = RSplit.A.e,
and each occurrence of a subexpression of the form A-REACH(·, ·, A) replaced
with A-REACH(·, ·, A_1) | A-REACH(·, ·, A_2), where $(A_1, A_2)$ = ASplit.A.e. By
Lemma 1, the boolean expression b SPLIT e is equivalent to b. With a judicious 
choice of events, we can therefore break down the evaluation of a complex query 
into multiple simpler queries. 

We illustrate this using the example of a Windows device driver for a floppy
disk¹, and concentrate on the Plug and Play (PNP) manager, which communicates
requests to devices via I/O request packets. For example, the request
IRP_MN_START_DEVICE instructs the driver to do all necessary hardware and soft- 
ware initialization so that the device can function. Figure 6 shows the code for 
the PNP manager. The code does some set-up work and then branches to handle 
each PNP request. We wish to verify a property of the driver that specifies the 
way I/O request packets must be handled². Let A be the observer automaton for
the property. 

NTSTATUS FloppyPnp(IN PDEVICE_OBJECT DeviceObject, IN PIRP Irp) {
  ...
  PIO_STACK_LOCATION irpSp = IoGetCurrentIrpStackLocation( Irp );

  switch ( irpSp->MinorFunction ) {
  L_1: case IRP_MN_START_DEVICE:
      ntStatus = FloppyStart(DeviceObject, Irp);
      break;
  L_2: case IRP_MN_QUERY_STOP_DEVICE:

      break;
      // several other cases
  L_k: default:

  }

  return ntStatus;
}



Fig. 6. A floppy driver. 



Intuitively, the verification can be broken into each kind of request sent by 
the PNP manager, that is, if we can prove the absence of error for each case in 
the switch statement, we have proved the program correct with respect to the 
property. Let e_1, ..., e_k be the events that denote the reaching of the program
labels L_1, ..., L_k, which correspond to each case in the switch statement. The
following relational query encodes the proof decomposition: 

( ... (SAFE(LOC_FUNC_INIT(loc, "FloppyPnp"), A) SPLIT e_1)
  ... SPLIT e_k)

This query breaks the safety property specified by A into several simpler queries,
one for each combination of possible branches of the switch statement. While this
results in exponentially many subqueries, all but k of these subqueries (where
more than one, or none of the events happens) are evaluated very efficiently by
exploiting the syntactic control-flow structure of the program, by noting that a
violation of the subproperty is syntactically impossible. The remaining k cases,
which are syntactically possible, are then model checked independently, leading
to a more efficient check, because independent abstractions can be maintained.

¹ Available with the Microsoft Windows DDK.

² Personal communication with T. Ball and S. Rajamani.

Fig. 7. Architecture of the verification toolkit.

5 Tool Architecture 

The overall architecture of the implementation is shown in Figure 7. CIL [15] 
parses the input program and produces the AST used by the program trans- 
former. The query parser parses the specification file and extracts program- 
transformation rules to later guide the program instrumentation. It also prepares 
the data structures for the relational computations. The program transformer 
takes as input the representation of the original program and the transforma- 
tion rules. When required by the query interpreter, it takes one particular set 
of transformation rules at a time (corresponding to one observer automaton) 
and performs the instrumentation. The result is the AST of the instrumented 
code. The query interpreter is the central controlling unit in this architecture. It 
dispatches the current query from the query queue to the relational-algebra en- 
gine for execution. If the next statement is a REACH expression, it first requests 
the instrumented version of the program from the transformer, then requests 
the relational-manipulation engine to transfer the input relations to the model- 
checking engine, and then starts the model checker Blast. When the model 
checking is completed, the relational-manipulation engine stores the results of 
the query and gives the control back to the query interpreter. 

The relational-algebra engine is a calculator for relational expressions. It
uses a highly optimized BDD-based library for querying and manipulating re- 
lations [3]. This library deals with relations on the level of predicate calculus. 
There is no need to encode variables and values to bit representations, because 
the library provides automatic value encoding and efficient high-level operations 
to abstract from the core BDD algorithms. 

References 

1. T. Ball and S.K. Rajamani. The SLAM project: Debugging system software via 
static analysis. In Proc. POPL, pages 1-3. ACM, 2002. 

2. T. Ball and S.K. Rajamani. SLIC: A specification language for interface checking 
(of C). Technical Report MSR-TR-2001-21, Microsoft Research, 2002. 

3. D. Beyer, A. Noack, and C. Lewerentz. Simple and efficient relational querying of 
software structures. In Proc. WCRE, pages 216-225. IEEE, 2003.

4. H. Chen and D. Wagner. MOPS: An infrastructure for examining security proper- 
ties of software. In Proc. CCS, pages 235-244. ACM, 2002. 

5. J.C. Corbett, M.B. Dwyer, J. Hatcliff, and Robby. A language framework for ex- 
pressing checkable properties of dynamic software. In Proc. SPIN, LNCS 1885, 
pages 205-223. Springer, 2000. 

6. S. Goldsmith, R. O’Callahan, and A. Aiken. Lightweight instrumentation from 
relational queries on program traces. Technical Report CSD-04-1315, UC Berkeley, 
2004. 

7. S. Hallem, B. Chelf, Y. Xie, and D. Engler. A system and language for building 
system-specific static analyses. In Proc. PLDI, pages 69-82. ACM, 2002. 

8. T.A. Henzinger, R. Jhala, R. Majumdar, and K.L. McMillan. Abstractions from 
proofs. In Proc. POPL, pages 232-244. ACM, 2004. 

9. T.A. Henzinger, R. Jhala, R. Majumdar, G.C. Necula, G. Sutre, and W. Weimer. 
Temporal-safety proofs for systems code. In Proc. CAV, LNCS 2404, pages 526- 
538. Springer, 2002. 

10. T.A. Henzinger, R. Jhala, R. Majumdar, and M.A.A. Sanvido. Extreme model 
checking. In International Symposium on Verification: Theory and Practice, 
LNCS 2772, pages 332-358. Springer, 2003. 

11. T.A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre. Lazy abstraction. In Proc. 
POPL, pages 58-70. ACM, 2002.

12. G.J. Holzmann. Logic verification of ANSI-C code with SPIN. In Proc. SPIN, 
LNCS 1885, pages 131-147. Springer, 2000.

13. G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C.V. Lopes, J.-M. Loingtier, 
and J. Irwin. Aspect-oriented programming. In Proc. ECOOP, LNCS 1241, pages 
220-242. Springer, 1997. 

14. K.L. McMillan. A methodology for hardware verification using compositional 
model checking. Science of Computer Programming, 37(1-3):279-309, 2000.

15. G.C. Necula, S. McPeak, S.P. Rahul, and W. Weimer. CIL: Intermediate language 
and tools for analysis and transformation of C programs. In Proc. CC, LNCS 2304, 
pages 213-228. Springer, 2002. 

16. R.E. Strom and S. Yemini. Typestate: A programming language concept for en- 
hancing software reliability. IEEE Trans. Software Engineering, 12(1):157-171, 
1986. 

17. E. Yahav and G. Ramalingam. Verifying safety properties using separation and 
heterogeneous abstractions. In Proc. PLDI, pages 25-34. ACM, 2004.




Program Generators and the Tools 
to Make Them 



Yannis Smaragdakis 



College of Computing 
Georgia Institute of Technology 
Atlanta, GA 30332, USA 
yannis@cc.gatech.edu



Abstract. Program generation is among the most promising techniques 
in the effort to increase the automation of programming tasks. In this 
talk, we discuss the potential impact and research value of program gen- 
eration, we give examples of our research in the area, and we outline a 
future work direction that we consider most interesting. 

Specifically, we first discuss why program generators have significant applied
potential. We believe that program generators can be made easy-
to-implement so that they are competitive with traditional software li- 
braries in many software domains. Compared to a common library, a 
generator implementing a domain-specific language can offer more con- 
cise syntax, better static error checking, and better performance through 
cross-operation optimizations. 

Despite the significant applied value of generators, however, we argue 
that meta-programming tools (i.e., language tools for writing program 
generators) may be of greater value as a research topic. The reason has to 
do with the domain-specificity of generators. The value of a program gen- 
erator is often tied so closely to a software domain that there is little gen- 
eral and reusable knowledge to transmit to other generator researchers. 
We discuss meta-programming tools as an area with both interesting 
conceptual problems and great value. A good meta-programming infras- 
tructure can simplify the creation of generators to make them an effective 
solution for many more domains. 

We illustrate our views on generators and meta-programming tools with 
two artifacts from our latest work: the Meta-AspectJ meta-programming 
language [6] and the GOTECH generator [5]. Meta-AspectJ enables generating
Java and AspectJ programs using code templates, i.e., quote and
unquote operators. Meta-AspectJ has two interesting elements. First, 
we believe that using the AspectJ language as a back-end simplifies the 
task of writing a generator. The GOTECH generator uses this technique
to adapt a Java program for server side execution in a J2EE application 
server. Second, Meta-AspectJ is a technically mature meta-programming 
tool - in many respects the most advanced meta-programming tool for 
Java. For instance, Meta-AspectJ reduces the need to deal with low 
level syntactic types for quoted entities (e.g., “expression”, “statement”, 
“identifier”, etc.) through type inference and a context-sensitive parsing 
algorithm. 

Finally, we examine the problem of statically determining the safety of
a generator and present its intricacies. We limit our focus to one partic- 
ular kind of guarantee for generated code: ensuring that the generated 
program is free of compile-time errors, such as type errors, references to 
undefined variables, etc. We argue that it is the responsibility of a good 
meta-programming tool to ensure that the generators written in it will al- 
ways produce legal programs. Nevertheless, if we do not severely limit the 
generator, the problem becomes one of arbitrary control- and data-flow 
analysis. We discuss why the limitations of current meta-programming 
tools that offer safety guarantees [1,4] are too strict and we present pos- 
sible avenues for future research. 

For further reading, a full paper accompanying this talk can be found in 
the PEPM’04 proceedings. The reader may also want to consult one of 
the good surveys on program generation, examining the topic either from 
an applied perspective [3] or from a partial evaluation perspective [2]. 



Abstract. Two trends are emerging in the World Wide Web (WWW).
The first is the proliferation of Web Services - self-contained, Web- 
accessible software applications and associated distributed systems ar- 
chitectures. The second is the emergence of the “Semantic Web,” the 
vision for a next-generation WWW that is computer interpretable. To- 
day’s Web was designed primarily for human use. To enable reliable, 
large-scale automated interoperation of Web services, their properties 
and capabilities must be understandable to a computer program. In this 
talk we briefly overview our ongoing work to develop a declarative lan- 
guage for describing Web services on the Semantic Web, contrasting it 
with emerging industrial Web service and Semantic Web standards. Our 
declarative representation of Web services enables automation of a wide 
variety of tasks including discovery, invocation, interoperation, composi- 
tion, simulation, verification and monitoring. 

To address the problem of automated Web service composition, we pro- 
pose automated reasoning techniques based on the notion of generic pro- 
cedures and customizing user constraint. To this end, we adapt and ex- 
tend a logic programming language to enable programs that are generic, 
customizable and usable in the context of the Web. We combine these 
with deductive synthesis techniques to generate compositions of Web 
services. Further, we propose logical criteria for these generic procedures 
that define when they are knowledge self-sufficient and physically self- 
sufficient. To support information gathering combined with search, we 
propose a middle-ground interpreter that operates under an assumption 
of reasonable persistence of key information. Our implemented prototype 
system is currently interacting with services on the Web. Parts of this 
work were done in collaboration with Tran Cao Son, Honglei Zeng and 
Ronald Fadel. 



Abstract. We redevelop and extend Dams’s results on over- and under-approximation
with higher-order Galois connections:

(1) We show how Galois connections are generated from U-GLB-L-LUB- 
closed binary relations, and we apply them to lower and upper powerset 
constructions, which are weaker forms of powerdomains appropriate for 
abstraction studies. 

(2) We use the powerset types within a family of logical relations, show 
when the logical relations preserve U-GLB-L-LUB-closure, and show that 
simulation is a logical relation. We use the logical relations to rebuild 
Dams’s most-precise simulations, revealing the inner structure of over- 
and under-approximation. 

(3) We extract validation and refutation logics from the logical relations, 
state their resemblance to Hennessy-Milner logic and description logic, 
and obtain easy proofs of soundness and best precision. 



Almost all Galois-connection-based static analyses are over-approximating: For 
Galois connection, $(\mathcal{P}(C), \subseteq)\langle\alpha_o, \gamma\rangle(A, \sqsubseteq_A)$, an abstract value $a \in A$ proclaims a 
property of all the outputs of a program. For example, $even \in Parity$ (see Figure 
2 for the abstract domain Parity) asserts "$\forall even$": all the program’s outputs 
are even numbers, that is, the output is a set from $\{S \in \mathcal{P}(Nat) \mid S \subseteq \gamma(even)\}$. 

An under-approximating Galois connection, $(\mathcal{P}(C), \supseteq)\langle\alpha_u, \gamma\rangle A^{op}$, where 
$A^{op} = (A, \sqsupseteq_A)$, is the dual. Here, $even \in Parity^{op}$ asserts that all even numbers 
are included in the program’s outputs - a strong assertion. Also, we may reuse 
$\gamma : A \to \mathcal{P}(C)$ as the upper adjoint from $A^{op}$ to $\mathcal{P}(C)^{op}$ iff $\gamma$ preserves joins in 
$(A, \sqsubseteq_A)$ - another strong demand. 

Fortunately, there is an alternative view of under-approximation: $a \in A^{op}$ 
asserts an existential property - there exists an output with property a. For 
example, $even \in Parity^{op}$ asserts "$\exists even$" - there is an even number in the 
program’s outputs, which is a set from $\{S \in \mathcal{P}(Nat) \mid S \cap \gamma(even) \neq \emptyset\}$. 

Now, we can generalize both over- and under-approximation to multiple 
properties, e.g., $\forall\{even, odd\} \equiv \forall(even \vee odd)$ - all outputs are even- or odd-valued; 
and $\exists\{even, odd\} \equiv \exists even \wedge \exists odd$ - the output set includes an even value and 
an odd value. These examples "lift" $A$ and $A^{op}$ into the powerset lattices, $\mathcal{P}_L(A)$ 
and $\mathcal{P}_U(A)$, respectively, and set the stage for the problem studied in this paper. 

Supported by NSF ITR-0085949 and ITR-0086154. 

Concrete transition system: 

$\Sigma = \{c_0, c_1, c_2\}$ 

$R = \{(c_0, c_1), (c_1, c_2)\}$ 

Approximating the state set, $\Sigma$, by $A = \{\bot, a_0, a_{12}, \top\}$; $\alpha : \mathcal{P}(\Sigma) \to A$ is: 
$\alpha\{c_0\} = a_0$, $\alpha\{c_1\} = a_{12} = \alpha\{c_2\} = \alpha\{c_1, c_2\}$, $\alpha\{c_0, c_1, c_2\} = \top$, etc. 

Over-approximating ("may": $\exists\exists$) transition system: 

$A = \{\bot, a_0, a_{12}, \top\}$ 

$R^\sharp = \{(a_0, a_{12}), (a_{12}, a_{12}), (\top, a_{12})\}$ 

Under-approximating ("must": $\forall\exists$) transition system: 

$A = \{\bot, a_0, a_{12}, \top\}$ 

$R^\flat = \{(a_0, a_{12}), (\bot, \bot)\}$ 

The mixed transition system is $(A, R^\sharp, R^\flat)$. 

[State-transition diagrams of the three systems omitted.] 

Fig. 1. An example mixed transition system 



1 Dams’s Mixed-Transition Systems 

In his thesis [10] and in subsequent work [11], Dams studied over- and 
under-approximations of state-transition relations, $R \subseteq C \times C$, for a discretely ordered 
set, C, of states. Given complete lattice $(A, \sqsubseteq_A)$ and the Galois connection, 
$(\mathcal{P}(C), \subseteq)\langle\alpha, \gamma\rangle(A, \sqsubseteq_A)$, Dams defined an over-approximating transition 
relation, $R^\sharp \subseteq A \times A$, and an under-approximating transition relation, $R^\flat \subseteq A \times A$, 
as follows: 

$a R^\sharp a'$ iff $a' \in \{\alpha(Y) \mid Y \in \min\{S' \mid R^{\exists\exists}(\gamma(a), S')\}\}$ 
$a R^\flat a'$ iff $a' \in \{\alpha(Y) \mid Y \in \min\{S' \mid R^{\forall\exists}(\gamma(a), S')\}\}$ ¹ 

such that $R^\sharp$ $\rho$-simulates $R$ (that is, all $R$-transitions are mimicked by $R^\sharp$, modulo 
$\rho \subseteq C \times A$, where $c \rho a$ iff $c \in \gamma(a)$), and $R$ $\rho^{-1}$-simulates $R^\flat$. See Figure 1 for 
an example of $R$ and its mixed transition system, $R^\sharp$, $R^\flat$. 

For the branching-time modalities $\Box$ ($\forall R$) and $\Diamond$ ($\exists R$), 

$a \models \Box\phi$ iff for all $a'$, $a R^\sharp a'$ implies $a' \models \phi$ 
$a \models \Diamond\phi$ iff there exists $a'$ such that $a R^\flat a'$ and $a' \models \phi$ 

Dams proved soundness: $a \models \phi$ and $c \rho a$ imply $c \models \phi$. With impressive work, 
Dams also proved "best precision" [11]: For all $\rho$- (and $\rho^{-1}$-) simulations, $R^\sharp$ 
and $R^\flat$ preserve the most $\Box\Diamond$- (mu-calculus [20,21]) properties. 
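
Both definitions can be checked mechanically on a finite system. The following Python sketch recomputes the mixed transition system of Figure 1. It assumes the readings of $R^{\exists\exists}$ and $R^{\forall\exists}$ that the paper gives later (Section 6); the names `GAMMA`, `alpha`, `minimal` and `abstract_relation` are illustrative choices, not the paper's, and `"bot"`/`"top"` stand for $\bot$ and $\top$.

```python
from itertools import chain, combinations

# Figure 1: concrete states and transitions.
STATES = ["c0", "c1", "c2"]
R = {("c0", "c1"), ("c1", "c2")}

# Concretization of each abstract value.
GAMMA = {
    "bot": frozenset(),
    "a0": frozenset({"c0"}),
    "a12": frozenset({"c1", "c2"}),
    "top": frozenset(STATES),
}

def alpha(states):
    """Least abstract value whose concretization contains `states`."""
    candidates = [a for a, g in GAMMA.items() if states <= g]
    return min(candidates, key=lambda a: len(GAMMA[a]))

def powerset(xs):
    return [frozenset(s) for s in chain.from_iterable(
        combinations(xs, n) for n in range(len(xs) + 1))]

def r_exists_exists(m, n):
    """Some state of m steps to some state of n."""
    return any((x, y) in R for x in m for y in n)

def r_forall_exists(m, n):
    """Every state of m steps to some state of n."""
    return all(any((x, y) in R for y in n) for x in m)

def minimal(sets):
    """Keep only the inclusion-minimal sets."""
    return [s for s in sets if not any(t < s for t in sets)]

def abstract_relation(lifted):
    """Dams's construction: abstract the minimal target sets of gamma(a)."""
    pairs = set()
    for a, conc in GAMMA.items():
        targets = minimal([s for s in powerset(STATES) if lifted(conc, s)])
        pairs |= {(a, alpha(y)) for y in targets}
    return pairs

r_sharp = abstract_relation(r_exists_exists)   # over-approximation ("may")
r_flat = abstract_relation(r_forall_exists)    # under-approximation ("must")
```

Under these assumptions, `r_sharp` and `r_flat` coincide with the relations $R^\sharp$ and $R^\flat$ listed in Figure 1.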



¹ $R^{\exists\exists}$, $R^{\forall\exists}$, and the definitions themselves are explained later in the paper. 

1.1 Can We Derive Dams’s Results Within Galois-Connection Theory? 

Given that Dams begins with a Galois connection, it should be possible to re- 
construct his results entirely within a theory of higher-order Galois connections 
and gain new insights in the process. We do so in this paper. 

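First, we treat $R \subseteq C \times C$ as $R : C \to \mathcal{P}(C)$. This makes $R^\sharp : A \to \mathcal{P}_L(A)$, 
where $\mathcal{P}_L(\cdot)$ is a lower ($\subseteq$-ordered) powerset constructor². 

Given the Galois connection, $\mathcal{P}(C)\langle\alpha_\tau, \gamma_\tau\rangle A$, on states, we "lift" it to a Galois 
connection on powersets, $F[\mathcal{P}(C)]\langle\alpha_{F[\tau]}, \gamma_{F[\tau]}\rangle\mathcal{P}_L(A)$, so that 

1. $R^\sharp$ $\rho$-simulates $R$ iff $ext_{F[\tau]}(R) \circ \gamma_\tau \sqsubseteq_{A \to F[\mathcal{P}(C)]} \gamma_{F[\tau]} \circ R^\sharp$ 

2. the soundness of $a \models \Box\phi$ follows from Item 1 

3. $R^\sharp_{best} = \alpha_{F[\tau]} \circ ext_{F[\tau]}(R) \circ \gamma_\tau$ 

We do similar work for $R^\flat : A \to \mathcal{P}_U(A)$ and $\Diamond\phi$, where $\mathcal{P}_U(\cdot)$ is an upper 
($\supseteq$-ordered) powerset constructor³. 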

The crucial question is: What is $F[\mathcal{P}(C)]$? That is, how should we concretize 
a set $T \in \mathcal{P}_L(A)$? First, we write $c \rho_\tau a$ to assert that $c \in C$ is approximated 
by $a \in A$. (For example, for Galois connection, $\mathcal{P}(C)\langle\alpha_\tau, \gamma_\tau\rangle A$, define $c \rho_\tau a$ iff 
$c \in \gamma_\tau(a)$.) Then, $S \in \mathcal{P}(C)$ is approximated by $T \in \mathcal{P}_L(A)$ iff $S \rho_{\mathcal{P}_L(\tau)} T$, where 

$S \rho_{\mathcal{P}_L(\tau)} T$ iff for every $c \in S$, there exists $a \in T$ such that $c \rho_\tau a$ ⁴ 

This might suggest that $F[\mathcal{P}(C)]$ is just $\mathcal{P}(C)$, and the concretization, 
$\gamma_{\mathcal{P}_L(\tau)} : \mathcal{P}_L(A) \to \mathcal{P}(C)$, is $\gamma_{\mathcal{P}_L(\tau)}(T) = \bigcup\{S \mid S \rho_{\mathcal{P}_L(\tau)} T\}$, which concretizes $T$ to the 
largest set that is approximated by $T$. 

But, as suggested by this paper’s prelude, an alternative is to define $F[\mathcal{P}(C)]$ 
as $\mathcal{P}_L(\mathcal{P}(C))$, because if an abstract state $a \in A$ concretizes to a set of states, 
$\gamma_\tau(a) \subseteq C$, then set $T \in \mathcal{P}_L(A)$ should concretize to a set of sets of states: 



[Diagram omitted: the concretization maps $\mathcal{P}_L(A)$ into $\mathcal{P}_L(\mathcal{P}(C))$.] 



² Think of the elements of $\mathcal{P}_L(A)$ as sets of properties, like $\forall\{even, odd\}$, as described 
in the prelude to Section 1. 

³ Think of the elements of $\mathcal{P}_U(A)$ as sets of properties, like $\exists\{even, odd\}$. 

⁴ This is the lower half of the Egli-Milner ordering, such that when $\rho_\tau \subseteq C \times C$ equals 
$\sqsubseteq_C$, $\rho_{\mathcal{P}_L(\tau)}$ freely generates the lower ("Hoare") powerdomain. 

Let Nat be the discretely ordered set of natural numbers, 
and let complete lattice 

              any 
             /   \ 
  Parity = even   odd 
             \   / 
              none 

We have the obvious Galois connection, $\mathcal{P}(Nat)\langle\alpha_{par}, \gamma_{par}\rangle Parity$, 
where $\gamma_{par}(even) = \{2n \mid n \in Nat\}$, $\gamma_{par}(any) = Nat$, etc. We can lift this to: 

$\gamma_{\mathcal{P}_U(Par)} : \mathcal{P}_\uparrow(Parity) \to \mathcal{P}_\downarrow(\mathcal{P}(Nat)^{op})$ 

where $\mathcal{P}_\uparrow(Parity)$ are the superset-ordered, up-closed subsets of Parity, 
and $\mathcal{P}_\downarrow(\mathcal{P}(Nat)^{op})$ are the subset-ordered, superset-closed subsets of $\mathcal{P}(Nat)$. 
$\gamma_{\mathcal{P}_U(Par)}$ is defined as follows: 

$\gamma_{\mathcal{P}_U(Par)}\{\}$ = all subsets of Nat 
    $\supseteq$ 
$\gamma_{\mathcal{P}_U(Par)}\{any\}$ = nonempty subsets of Nat 
    $\supseteq$ 
$\gamma_{\mathcal{P}_U(Par)}\{even, any\}$ = all sets with an even number 
    $\supseteq$ 
$\gamma_{\mathcal{P}_U(Par)}\{even, odd, any\}$ = all sets with an even and an odd number 
    $\supseteq$ 
$\gamma_{\mathcal{P}_U(Par)}\{none, even, odd, any\}$ = $\{\}$ 

Fig. 2. An under-approximation of sets of natural numbers by sets of parities 



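That is, $\mathcal{S} \in \mathcal{P}_L(\mathcal{P}(C))$ is approximated by $T \in \mathcal{P}_L(A)$ iff for every set $S \in \mathcal{S}$, 
$S \rho_{\mathcal{P}_L(\tau)} T$. This makes $\gamma_{\overline{\rho}_{\mathcal{P}_L(\tau)}}(T) = \{S \mid S \rho_{\mathcal{P}_L(\tau)} T\}$, which concretizes $T$ to 
the set of all sets approximated by $T$. 

For over-approximation, both approaches yield the same definition of $R^\sharp_{best} : 
A \to \mathcal{P}_L(A)$, but a sound under-approximation utilizes the second approach: 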




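$\mathcal{S}$ is under-approximated by $T$ iff for every set $S \in \mathcal{S}$, $S \rho_{\mathcal{P}_U(\tau)} T$, where 

$S \rho_{\mathcal{P}_U(\tau)} T$ iff for every $a \in T$, there exists some $c \in S$ such that $c \rho_\tau a$ ⁵ 

Thus, $\gamma_{\overline{\rho}_{\mathcal{P}_U(\tau)}} : \mathcal{P}_U(A) \to \mathcal{P}_L(\mathcal{P}_U(C))$ and $\gamma_{\overline{\rho}_{\mathcal{P}_U(\tau)}}(T) = \{S \mid S \rho_{\mathcal{P}_U(\tau)} T\}$, which 
is crucial to Dams’s results. Figure 2 gives an example of the construction. 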
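
The two set-level relations and the concretization of Figure 2 can be tried out directly. The sketch below is a hedged illustration only: it replaces Nat by the finite stand-in `range(4)` (an assumption, so that all subsets can be enumerated), and the function names are hypothetical.

```python
from itertools import chain, combinations

UNIVERSE = range(4)  # assumption: a finite stand-in for Nat

def rho(c, a):
    """Base relation: number c is approximated by parity a."""
    return {"none": False, "even": c % 2 == 0, "odd": c % 2 == 1, "any": True}[a]

def rho_lower(s, t):
    """S rho_{P_L} T: every c in S is approximated by some a in T."""
    return all(any(rho(c, a) for a in t) for c in s)

def rho_upper(s, t):
    """S rho_{P_U} T: every a in T is witnessed by some c in S."""
    return all(any(rho(c, a) for c in s) for a in t)

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, n) for n in range(len(xs) + 1))]

def gamma_upper(t):
    """Concretize T to the set of all sets that T under-approximates."""
    return {s for s in subsets(UNIVERSE) if rho_upper(s, t)}

# The chain of abstract sets from Figure 2, from least to most demanding.
CHAIN = [set(), {"any"}, {"even", "any"}, {"even", "odd", "any"},
         {"none", "even", "odd", "any"}]
sizes = [len(gamma_upper(t)) for t in CHAIN]
```

On this four-element universe the concretizations shrink along the chain, mirroring the $\supseteq$ steps of Figure 2: the empty abstract set admits every subset, and the set containing `none` admits no subset at all.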

⁵ This is the upper half of the Egli-Milner ordering, and when $\rho_\tau \subseteq C \times C$ is $\sqsubseteq_C$, 
$\rho_{\mathcal{P}_U(\tau)}$ freely generates the upper ("Smyth") powerdomain. 

1.2 Outline of Results 

Applying the just-stated approach, we redevelop and extend Dams’s results 
[10, 11] within a higher-order Galois-connection framework [9]: 

1. We show how Galois connections are generated from U-GLB-L-LUB-closed 
binary relations (cf. [8,23,28]). 

2. We define lower and upper powerset constructions, which are weaker forms 
of powerdomains appropriate for abstraction studies [9, 15,25]. 

3. We use the powerset types within a family of logical relations, show when 
the logical relations preserve the closure properties in Item 1, and show 
that simulation can be proved via logical relations. We incrementally 
rebuild Dams’s most-precise simulations with the logical relations, revealing 
the inner structure of under- and over-approximation on powersets. 

4. We extract validation and refutation logics from the logical relations (cf. [2]), 
state their resemblance to Hennessy-Milner logic [17] and description logic 
[3, 6], and obtain easy proofs of soundness and best precision. 



2 Closed Binary Relations Generate Galois Connections 

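The following results are assembled from [4,8,14,23,24,28]: Let C and A be 
complete lattices, and let $\rho \subseteq C \times A$, where $c \rho a$ means c is approximated by a. 

Definition 1. For all $c, c' \in C$, for $a, a' \in A$, for $\rho \subseteq C \times A$, $\rho$ is 

1. L-closed iff $c \rho a$ and $c' \sqsubseteq c$ imply $c' \rho a$ 

2. LUB-closed iff $\sqcup\{c \mid c \rho a\} \rho a$ 

3. U-closed iff $c \rho a$ and $a \sqsubseteq a'$ imply $c \rho a'$ 

4. GLB-closed iff $c \rho \sqcap\{a \mid c \rho a\}$ 

Proposition 2. For L-U-LUB-GLB-closed $\rho \subseteq C \times A$, $C\langle\alpha_\rho, \gamma_\rho\rangle A$ is a Galois 
connection, where $\alpha_\rho(c) = \sqcap\{a \mid c \rho a\}$ and $\gamma_\rho(a) = \sqcup\{c \mid c \rho a\}$. 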

[Diagram omitted: $\rho$ relating $C$ and $A$.] 



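As the diagram above suggests, U- and L-closure make $\gamma_\rho$ and $\alpha_\rho$ monotonic, 
and LUB- and GLB-closure make the functions select the most precise answers. 
Note that $c \rho a$ iff $c \sqsubseteq_C \gamma_\rho(a)$ iff $\alpha_\rho(c) \sqsubseteq_A a$. 

Proposition 3. For Galois connection, $C\langle\alpha, \gamma\rangle A$, define $\rho_{\alpha\gamma} \subseteq C \times A$ as 
$\{(c, a) \mid \alpha c \sqsubseteq_A a\}$. Then, $\rho_{\alpha\gamma}$ is L-U-LUB-GLB-closed and $\langle\alpha_{\rho_{\alpha\gamma}}, \gamma_{\rho_{\alpha\gamma}}\rangle = \langle\alpha, \gamma\rangle$. 

2.1 Completing a U-GLB-Closed $\rho \subseteq C \times A$ 

Often one has a discretely ordered set, C, a complete lattice, A, and an obvious 
approximation relation, $\rho \subseteq C \times A$. But there is no Galois connection between 
C and A, because $\rho$ lacks LUB-closure. We complete C to a powerset: 

Proposition 4. For set C, complete lattice A, and $\rho \subseteq C \times A$, define $\overline{\rho} \subseteq 
\mathcal{P}(C) \times A$ as $S \overline{\rho} a$ iff for all $c \in S$, $c \rho a$. If $\rho$ is U-GLB-closed, then $\overline{\rho}$ is 
U-GLB-L-LUB-closed and $\gamma_{\overline{\rho}} : A \to \mathcal{P}(C)$ is $\gamma_{\overline{\rho}}(a) = \{c \mid c \rho a\}$. 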

Figure 3 shows an application. There is no implementation penalty in apply- 
ing Proposition 4, because the abstract domain retains its existing cardinality. 
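
As a hedged illustration of Proposition 4, the sketch below completes a sign relation to sets of integers and checks the Galois-connection property by brute force. Two things are assumptions rather than part of the paper: Int is cut down to the finite slice `range(-2, 3)`, and Sign is taken to be the five-element lattice with `none` at the bottom and `any` at the top.

```python
from itertools import chain, combinations

INTS = range(-2, 3)  # assumption: a finite slice of Int
# Assumed shape of Sign: none below neg, zero, pos, which all lie below any.
SIGN = ["none", "neg", "zero", "pos", "any"]

def leq(a, b):
    """Ordering of the assumed Sign lattice."""
    return a == b or a == "none" or b == "any"

def rho(n, a):
    """Base relation between an integer and a sign."""
    return (a == "any" or (a == "neg" and n < 0)
            or (a == "zero" and n == 0) or (a == "pos" and n > 0))

def rho_bar(s, a):
    """Completed relation: S rho_bar a iff every c in S satisfies c rho a."""
    return all(rho(c, a) for c in s)

def gamma(a):
    """Upper adjoint: all integers related to a."""
    return frozenset(c for c in INTS if rho(c, a))

def alpha(s):
    """Lower adjoint: the least sign related to every element of S."""
    related = [a for a in SIGN if rho_bar(s, a)]
    return next(a for a in related if all(leq(a, b) for b in related))

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, n) for n in range(len(xs) + 1))]

# S is below gamma(a)  iff  alpha(S) is below a  iff  S rho_bar a.
galois_ok = all(
    (s <= gamma(a)) == leq(alpha(s), a) and leq(alpha(s), a) == rho_bar(s, a)
    for s in subsets(INTS) for a in SIGN)
```

The abstract domain keeps its five elements; only the concrete side grows to a powerset, which is the point of the remark about implementation cost above.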



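$\rho \subseteq Int \times Sign$ 
$n \rho neg$, for $n < 0$ 
$0 \rho zero$ 
$p \rho pos$, for $p > 0$ 
$i \rho any$, for $i \in Int$ 

[Diagram omitted: $\overline{\rho}$ relating $\mathcal{P}(Int)$ to $Sign$.] 

Fig. 3. Completing $\rho \subseteq Int \times Sign$ to $\overline{\rho} \subseteq \mathcal{P}(Int) \times Sign$ 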



3 Powersets 

Definition 5. For complete lattice, D, a powerset of D is 
$PD = (E, \sqsubseteq_E, \{|\cdot|\} : D \to E, \uplus : E \times E \to E)$, such that 

- $(E, \sqsubseteq_E)$ is a complete lattice 

- $\{|\cdot|\}$, the singleton operation, is monotone 

- $\uplus$, union operation, is monotone, absorptive, commutative, and associative 

- For every monotone $f : D \to L$, there is a monotone $ext(f) : E \to L$ such 
that $ext(f)\{|d|\} = f(d)$, for all $d \in D$. 

Here are examples of powersets from Cousot and Cousot [9] : 

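Down-Set (Order-Ideal) Completion: For $d \in D$, $S \subseteq D$, define $\downarrow d = \{e \in 
D \mid e \sqsubseteq d\}$ and $\downarrow S = \bigcup\{\downarrow d \mid d \in S\}$. Define $\mathcal{P}_\downarrow(D) = (\{\downarrow S \mid S \subseteq D\}, \subseteq, \downarrow, \cup)$. 

Join Completion (Subsets of $\mathcal{P}_\downarrow(D)$): $(M, \subseteq, \downarrow, \sqcup_M)$, where $M \subseteq \{\downarrow S \mid S \subseteq 
D\}$ is a Moore family (that is, closed under intersections)⁶. 

Figure 4 presents an example. For monotone $f : D \to L$, let $ext(f) : \mathcal{P}_\downarrow(D) \to L$ 
be $ext(f)(S) = \sqcup_{d \in S} f(d)$. 

Up-Set (Filter) Completion: For $d \in D$ and $S \subseteq D$, define $\uparrow d = \{e \in D \mid d \sqsubseteq 
e\}$ and $\uparrow S = \bigcup\{\uparrow d \mid d \in S\}$. Define $\mathcal{P}_\uparrow(D) = (\{\uparrow S \mid S \subseteq D\}, \supseteq, \uparrow, \cup)$. 

Dual-Join Completion: Subsets of $\mathcal{P}_\uparrow(D)$: $(M, \supseteq, \uparrow, \sqcap_M)$, where $M \subseteq \{\uparrow S \mid 
S \subseteq D\}$ is a Moore family. 

For monotone $f : D \to L$, let $ext(f) : \mathcal{P}_\uparrow(D) \to L$ be $ext(f)(S) = \sqcap_{d \in S} f(d)$. 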



⁶ Join completions "add new joins" to D; the trivial join completion is $(\{\downarrow d \mid d \in 
D\}, \subseteq, \downarrow, \downarrow \circ \sqcup_D)$, which is isomorphic to D, and the most detailed join completion is 
$\mathcal{P}_\downarrow(D)$. 

3.1 Lower and Strongly Lower Powersets 

For powerset PD, for $d \in D$ and $S \in PD$, define $d \in S$ iff $\{|d|\} \uplus S = S$. 

Definition 6. Powerset $\mathcal{P}_L(D) = (E, \sqsubseteq_E, \{|\cdot|\}, \uplus)$ is 

1. a lower powerset iff ((for all $x \in S_1$, there exists $y \in S_2$ such that $x \sqsubseteq_D y$) 
implies $S_1 \sqsubseteq_E S_2$). 

2. a strongly lower powerset iff ((for all $x \in S_1$, there exists $y \in S_2$ such that 
$x \sqsubseteq_D y$) iff $S_1 \sqsubseteq_E S_2$). 

Although lower powersets are the starting point for powerdomain theory [15, 
25]⁷, we work with strongly lower powersets⁸, because 

Proposition 7. For every strongly lower powerset, $\mathcal{P}_L(D) = (E, \sqsubseteq_E, \{|\cdot|\}, \uplus)$, 

(i) $\uplus = \sqcup_E$; and 

(ii) $\mathcal{P}_L(D)$ is order-isomorphic to a join completion of D, where the membership 
relation defined above is set membership, $\in$. 

Strongly lower powersets let us generalize Proposition 4: 

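Theorem 8. For complete lattices C and A, let $\rho \subseteq C \times A$ and let $\mathcal{P}_L(C) = 
(E, \subseteq, \{|\cdot|\}, \uplus)$ be a join completion (strongly lower powerset). Recall that $\overline{\rho} \subseteq 
\mathcal{P}_L(C) \times A$ is defined $S \overline{\rho} a$ iff for all $c \in S$, $c \rho a$. 

If (i) $\rho$ is U-L-GLB-closed, and (ii) for all $a \in A$, $\{c \mid c \rho a\} \in E$, then $\overline{\rho}$ is 
U-L-GLB-LUB-closed and $\gamma_{\overline{\rho}}(a) = \{c \mid c \rho a\}$. 

Thus, $\mathcal{P}_\downarrow(C)\langle\alpha_{\overline{\rho}}, \gamma_{\overline{\rho}}\rangle A$ is always a Galois connection for U-L-GLB-closed $\rho \subseteq 
C \times A$, but the minimal join completion of sets $\{c \mid c \rho a\}$, $a \in A$, also suffices to 
generate a Galois connection. 

For example, say that Int and $\rho$ in Figure 3 are replaced by $Int^\top_\bot$ from Figure 
4 and by $\rho \subseteq Int^\top_\bot \times Sign$, which is defined to be $\rho$ augmented by $\top \rho any$ and 
$\bot \rho a$, for all $a \in Sign$. Figure 4 shows $\rho$’s minimal join completion. 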

3.2 Upper Powersets 

As Plotkin [25] notes, the upper and strongly upper powersets coincide, so 

Definition 9. Powerset $\mathcal{P}_U(D) = (E, \sqsubseteq_E, \{|\cdot|\}, \uplus)$ is an upper powerset iff 
($S_1 \sqsubseteq_E S_2$ iff for all $y \in S_2$, there exists $x \in S_1$ such that $x \sqsubseteq_D y$). 

For an upper powerset, $\uplus = \sqcap_E$, and every upper powerset is isomorphic to a 
dual-join completion. 

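⁷ Which requires functions to be Scott-continuous. 

⁸ Which allows non-Scott-continuous, monotone functions. 

4 Logical Relations 

We attach these typings to the relations introduced in Section 2: 

$\tau ::= b \mid \tau_1 \to \tau_2 \mid \mathcal{P}_L(\tau) \mid \mathcal{P}_U(\tau) \mid \overline{\tau}$ 

Only typing $\overline{\tau}$ is nonstandard; it is a special case of $\mathcal{P}_L(\tau)$ that we retain for 
convenience, because it appears so often in the practice of generating Galois 
connections. 

We attach the typings to concrete and abstract domains, D, as follows: 

$D_b$ is given, for base type b 
$D_{\tau_1 \to \tau_2}$ are the monotone functions from $D_{\tau_1}$ to $D_{\tau_2}$, ordered pointwise 
$D_{\mathcal{P}_L(\tau)}$ is a strongly lower powerset generated from $D_\tau$ 
$D_{\mathcal{P}_U(\tau)}$ is an upper powerset generated from $D_\tau$ 

Since $\overline{\rho} \subseteq \mathcal{P}_L(C) \times A$ is the completion of $\rho \subseteq C \times A$ (cf. Theorem 8), we define 

$C_{\overline{\tau}}$ is $C_{\mathcal{P}_L(\tau)}$, for concrete domain $C_\tau$ 
$A_{\overline{\tau}}$ is $A_\tau$, for abstract domain $A_\tau$ 

Now, we can define this family of logical relations, $\rho_\tau \subseteq C_\tau \times A_\tau$: 

$\rho_b$ is given, for base type b 
$f \rho_{\tau_1 \to \tau_2} f^\sharp$ iff for all $c \in C_{\tau_1}$, $a \in A_{\tau_1}$, $c \rho_{\tau_1} a$ implies $f(c) \rho_{\tau_2} f^\sharp(a)$ 
$S \rho_{\mathcal{P}_L(\tau)} T$ iff for all $c \in S$, there exists $a \in T$ such that $c \rho_\tau a$ 
$S \rho_{\mathcal{P}_U(\tau)} T$ iff for all $a \in T$, there exists $c \in S$ such that $c \rho_\tau a$ 
$S \rho_{\overline{\tau}} a$ iff for all $c \in S$, $c \rho_\tau a$ 

Again, note that $\rho_{\overline{\tau}} \subseteq C_{\mathcal{P}_L(\tau)} \times A_\tau$ is an instance of $\rho_{\mathcal{P}_L(\tau)} \subseteq C_{\mathcal{P}_L(\tau)} \times A_{\mathcal{P}_L(\tau)}$, 
where $C_{\mathcal{P}_L(\tau)}$ is treated as a join completion and $A_{\mathcal{P}_L(\tau)}$ is restricted to the 
trivial join completion, $(\{\downarrow a \mid a \in A_\tau\}, \subseteq, \downarrow, \downarrow \circ \sqcup)$, which is isomorphic to $A_\tau$. 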

4.1 Simulations Are Logical Relations 

The standard definition of simulation goes as follows: 

Definition 10. For $\rho \subseteq C \times A$ and transition relations, $R \subseteq C \times C$, $R^\sharp \subseteq A \times A$, 
$R^\sharp$ $\rho$-simulates $R$, written $R \triangleleft_\rho R^\sharp$, iff for all $c, c' \in C$, $a \in A$, 

$c \rho a$ and $c R c'$ imply there exists $a' \in A$ such that $a R^\sharp a'$ and $c' \rho a'$. 

When we represent $R$ and $R^\sharp$ as $R : C \to \mathcal{P}_L(C)$ and $R^\sharp : A \to \mathcal{P}_L(A)$, 
respectively, we have 

Theorem 11. $R \triangleleft_{\rho_b} R^\sharp$ iff $R \rho_{b \to \mathcal{P}_L(b)} R^\sharp$ ⁹ 

A dual simulation, $R^\flat \triangleleft_{\rho^{-1}} R$, is beautifully characterized as $R \rho_{b \to \mathcal{P}_U(b)} R^\flat$. We 
employ these characterizations of simulation and dual-simulation to construct 
optimal over- and under-approximating transition relations from Galois 
connections generated from closed, logical relations¹⁰. 
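
Theorem 11 and its dual can be exercised on the system of Figure 1. The Python sketch below is an illustration under stated assumptions: transition relations are stored as successor maps, $c \rho a$ is taken to mean $c \in \gamma(a)$, and all identifiers are hypothetical. It compares Definition 10 with the function-typed logical relation and also checks the dual characterization for $R^\flat$.

```python
STATES = ["c0", "c1", "c2"]
ABSTRACT = ["bot", "a0", "a12", "top"]
GAMMA = {"bot": set(), "a0": {"c0"}, "a12": {"c1", "c2"}, "top": set(STATES)}

# Figure 1, with relations recoded as successor functions.
R = {"c0": {"c1"}, "c1": {"c2"}, "c2": set()}
R_SHARP = {"bot": set(), "a0": {"a12"}, "a12": {"a12"}, "top": {"a12"}}
R_FLAT = {"bot": {"bot"}, "a0": {"a12"}, "a12": set(), "top": set()}

def rho(c, a):
    return c in GAMMA[a]

def simulates(r, r_sharp):
    """Definition 10, unfolded pointwise."""
    return all(any(a2 in r_sharp[a] and rho(c2, a2) for a2 in ABSTRACT)
               for c in STATES for a in ABSTRACT if rho(c, a)
               for c2 in r[c])

def rho_lower(s, t):
    return all(any(rho(c, a) for a in t) for c in s)

def rho_upper(s, t):
    return all(any(rho(c, a) for c in s) for a in t)

def related_at_function_type(set_rel, f, f_abs):
    """f rho_{b -> P(b)} f_abs: related arguments yield related result sets."""
    return all(set_rel(f[c], f_abs[a])
               for c in STATES for a in ABSTRACT if rho(c, a))

# An unsound candidate: a0 loses its only abstract successor.
R_BAD = dict(R_SHARP, a0=set())

sim_good = simulates(R, R_SHARP)
log_good = related_at_function_type(rho_lower, R, R_SHARP)
sim_bad = simulates(R, R_BAD)
log_bad = related_at_function_type(rho_lower, R, R_BAD)
dual_good = related_at_function_type(rho_upper, R, R_FLAT)
dual_bad = related_at_function_type(rho_upper, R, dict(R_FLAT, a12={"a12"}))
```

The two formulations agree on both the sound and the unsound candidate, and the dual check rejects an abstract "must" transition out of `a12`, because `c2` has no concrete successor to witness it.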

⁹ The proof assumes that $R$ and $R^\sharp$ behave monotonically. 

¹⁰ Please see Section 8 for a summary of Loiseaux, et al. [22], which also characterizes 
simulations as Galois connections. 

Fig. 5. An L-LUB-closed relation between strongly lower powersets 

5 Closure Properties of Logical Relations 

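Proposition 12. For $\rho_\tau \subseteq C_\tau \times A_\tau$ and for $F[\tau] \in \{\tau' \to \tau, \mathcal{P}_L(\tau), \mathcal{P}_U(\tau), \overline{\tau}\}$, 
If $\rho_\tau$ is L-closed, then so is $\rho_{F[\tau]}$. 
If $\rho_\tau$ is U-closed, then so is $\rho_{F[\tau]}$. 
If $\rho_\tau$ is U-GLB-closed, then so are $\rho_{\tau' \to \tau}$, $\rho_{\overline{\tau}}$, and $\rho_{\mathcal{P}_L(\tau)}$. 
If $\rho_\tau$ is L-LUB-closed, then so are $\rho_{\tau' \to \tau}$ and $\rho_{\mathcal{P}_U(\tau)}$. 

Preservation of LUB-closure for $\rho_{\mathcal{P}_L(\tau)}$ and GLB-closure for $\rho_{\mathcal{P}_U(\tau)}$ depends on 
the specific powersets used (cf. Backhouse and Backhouse [4]). 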

Here are some additional useful properties: 

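Proposition 13. Let $\rho_{\tau_i} \subseteq C_{\tau_i} \times A_{\tau_i}$, for $i \in 1..2$, be U-GLB-L-LUB-closed. 
For $f : C_{\tau_1} \to C_{\tau_2}$, $f^\sharp : A_{\tau_1} \to A_{\tau_2}$, 

$f \rho_{\tau_1 \to \tau_2} f^\sharp$ iff $\alpha_{\rho_{\tau_2}} \circ f \sqsubseteq_{C_{\tau_1} \to A_{\tau_2}} f^\sharp \circ \alpha_{\rho_{\tau_1}}$. 

In particular, $f \rho_{\tau_1 \to \tau_2} f^\sharp_{best}$, where $f^\sharp_{best} = \alpha_{\rho_{\tau_2}} \circ f \circ \gamma_{\rho_{\tau_1}}$. 

If $\rho_\tau \subseteq C_\tau \times A_\tau$ is L-LUB-closed, then so is $\rho_{\mathcal{P}_L(\tau)} \subseteq \mathcal{P}_\downarrow(C_\tau) \times \mathcal{P}_L(A_\tau)$, for 
any choice of $\mathcal{P}_L(A_\tau)$; this follows from 

Proposition 14. For all $T \in \mathcal{P}_L(A_\tau)$, let $\mathcal{C}_T = \{S \in \mathcal{P}_L(C_\tau) \mid S \rho_{\mathcal{P}_L(\tau)} T\}$. If 

1. $\rho_\tau$ is L-LUB-closed; and 

2. for all $c \in \sqcup\mathcal{C}_T$, there exists $a \in T$ such that $c = \sqcup S_a$, where $S_a \subseteq \{c' \in S \in 
\mathcal{C}_T \mid c' \rho_\tau a\}$ 

then $\rho_{\mathcal{P}_L(\tau)} \subseteq \mathcal{P}_L(C_\tau) \times \mathcal{P}_L(A_\tau)$ is LUB-closed. 

That is, given the lower powerset $\mathcal{P}_L(C_\tau)$, we require $\sqcup\mathcal{C}_T \rho_{\mathcal{P}_L(\tau)} T$, for all 
$T \in \mathcal{P}_L(A_\tau)$. Item 2 says that every element, $c \in \sqcup\mathcal{C}_T$, is a join of elements that 
are all related to some $a \in T$. By L-LUB-closure of $\rho_\tau$, we have $c \rho_\tau a$, giving 
LUB-closure for $\rho_{\mathcal{P}_L(\tau)}$. 

Often we can use a coarser join completion than $\mathcal{P}_\downarrow(C_\tau)$ to get LUB-closure. 
For example, for $C_{Int} = Int^\top_\bot$ from Figure 4 and $A_{Int} = Sign$ from Figure 3, the 
version of $\mathcal{P}_L(Sign)$ in Figure 5 (right) requires merely the $\mathcal{P}_L(Int^\top_\bot)$ in Figure 
5 (left) for LUB-closure of $\rho_{\mathcal{P}_L(Sign)} \subseteq \mathcal{P}_L(Int^\top_\bot) \times \mathcal{P}_L(Sign)$. 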

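Of course, for $\mathcal{P}_L(C_\tau)$ to be useful for giving the semantics of transition relation 
$R \subseteq C_\tau \times C_\tau$, we require that $\downarrow\{c' \mid c R c'\} \in \mathcal{P}_L(C_\tau)$, for all $c \in C_\tau$. 

Proposition 15. For all $S \in \mathcal{P}_U(C_\tau)$, let $\mathcal{G}_S = \{T \in \mathcal{P}_U(A_\tau) \mid S \rho_{\mathcal{P}_U(\tau)} T\}$. If 

1. $\rho_\tau$ is U-GLB-closed, and 

2. for all $a \in \sqcap\mathcal{G}_S$, there exists $c \in S$ such that $a = \sqcap_A T_c$, where $T_c \subseteq 
\{a' \in T \in \mathcal{G}_S \mid c \rho_\tau a'\}$, 

then $\rho_{\mathcal{P}_U(\tau)} \subseteq \mathcal{P}_U(C_\tau) \times \mathcal{P}_U(A_\tau)$ is GLB-closed. 

For all choices of $\mathcal{P}_U(C_\tau)$, Proposition 15 successfully applies to $\mathcal{P}_\uparrow(A_\tau)$, the 
filter completion of $A_\tau$. But often a coarser, dual-join completion of $A_\tau$ will do: 
When giving semantics to transition relation $R \subseteq C_\tau \times C_\tau$, we require only that 
$R$’s image lies in $\mathcal{P}_U(C_\tau)$: $\uparrow\{c' \mid c R c'\} \in \mathcal{P}_U(C_\tau)$, for all $c \in C_\tau$. This coarser 
domain for $\mathcal{P}_U(C_\tau)$ lets us use a coarser $\mathcal{P}_U(A_\tau)$ with Proposition 15. 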

6 Synthesizing a Most-Precise Simulation 

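Dams [10, 11] proves, for Galois connection $\mathcal{P}(C)\langle\alpha, \gamma\rangle A$ and relation $R \subseteq C \times C$, 
that the most precise, sound, abstract transition relation $R^\sharp_0 \subseteq A \times A$ is 

$R^\sharp_0(a, a')$ iff $a' \in \{\alpha(Y) \mid Y \in \min\{S' \mid R^{\exists\exists}(\gamma(a), S')\}\}$ 

where $R^{\exists\exists}(M, N)$ holds iff there exist $m \in M$ and $n \in N$ such that $m R n$. 
Recoded as a function, $R^\sharp_0 : A \to \mathcal{P}_L(A)$, and simplified, this reads 

$R^\sharp_0(a) = \{\alpha\{s'\} \mid \exists s \in \gamma(a), s' \in R(s)\}$ 

Our machinery gives us the same result: Given U-GLB-closed $\rho_b \subseteq C \times A$ 
and transition function $R : C \to \mathcal{P}(C)$, we generate the Galois connections, 
$\mathcal{P}(C)\langle\alpha_{\overline{\rho}_b}, \gamma_{\overline{\rho}_b}\rangle A$ and $\mathcal{P}(C)\langle\alpha_{\rho_{\mathcal{P}_L(b)}}, \gamma_{\rho_{\mathcal{P}_L(b)}}\rangle\mathcal{P}_L(A)$, and synthesize the most 
precise, sound abstract transition function, $R^\sharp_{best} : A \to \mathcal{P}_L(A)$, 

$R^\sharp_{best}(a) = (\alpha_{\rho_{\mathcal{P}_L(b)}} \circ ext_b(R) \circ \gamma_{\overline{\rho}_b})(a) = \downarrow\{\alpha_{\overline{\rho}_b}\{s'\} \mid \exists s \in \gamma_{\overline{\rho}_b}(a), s' \in R(s)\}$ 

which is Dams’s definition, when $\mathcal{P}_L(A)$ is $\mathcal{P}_\downarrow(A)$. We have $R \triangleleft_{\rho_b} R^\sharp_{best}$. 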

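As suggested in Section 1.1, we might also derive an abstract transition 
relation that is sound with respect to sets of sets: We generate the Galois 
connection, $\mathcal{P}_\downarrow(\mathcal{P}(C))\langle\alpha_{\overline{\rho}_{\mathcal{P}_L(b)}}, \gamma_{\overline{\rho}_{\mathcal{P}_L(b)}}\rangle\mathcal{P}_L(A)$, and for $R : C \to \mathcal{P}(C)$, we generate 
$R^\sharp_{best} : A \to \mathcal{P}_L(A)$, 

$R^\sharp_{best} = \alpha_{\overline{\rho}_{\mathcal{P}_L(b)}} \circ ext_b(\{|\cdot|\}_{\mathcal{P}_\downarrow(\mathcal{P}(C))} \circ R) \circ \gamma_{\overline{\rho}_b}$ 

where $ext_b(\{|\cdot|\}_{\mathcal{P}_\downarrow(\mathcal{P}(C))} \circ R)(S) = \downarrow\{R(c) \mid c \in S\}$, and $\alpha_{\overline{\rho}_{\mathcal{P}_L(b)}}(\mathcal{S}) = \sqcap\{T \mid$ for all 
$S \in \mathcal{S}, S \rho_{\mathcal{P}_L(b)} T\}$. That is, $ext_b(\{|\cdot|\}_{\mathcal{P}_\downarrow(\mathcal{P}(C))} \circ R)$ maps a set of arguments to the 
set of $R$-successor sets, and $\alpha_{\overline{\rho}_{\mathcal{P}_L(b)}}$ produces the smallest abstract set that 
over-approximates each of the successor sets. We have $R \triangleleft_{\rho_b} R^\sharp_{best}$, and $R^\sharp_{best}$ 
equals the definition seen earlier. 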

This development is notational overkill, but there is an important point: 
Simulation equivalence is preserved when a concrete transition function is lifted 
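to a function that maps a set of arguments to a set of sets of answers: 

Theorem 16. $R \triangleleft_{\rho_b} R^\sharp$ iff $R \rho_{b \to \mathcal{P}_L(b)} R^\sharp$ iff $ext_b(R) \rho_{\overline{b} \to \mathcal{P}_L(b)} R^\sharp$ 
iff $ext_b(\{|\cdot|\} \circ R) \rho_{\overline{b} \to \overline{\mathcal{P}_L(b)}} R^\sharp$. 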

This idea will prove crucial when working with under-approximations. 



6.1 Synthesizing a Most-Precise Dual Simulation 

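There is a good use for $\overline{\rho}_{\mathcal{P}_U(\tau)}$: defining a sound, over-approximation analysis of 
under-approximations. 

Consider $\overline{\rho}_{\mathcal{P}_U(\tau)} \subseteq \mathcal{P}_L(\mathcal{P}_U(C)) \times \mathcal{P}_U(A)$; it says that $\mathcal{S} \overline{\rho}_{\mathcal{P}_U(\tau)} T$ iff for each 
set $S \in \mathcal{S}$, $S \rho_{\mathcal{P}_U(\tau)} T$, that is, $T$ under-approximates each $S \in \mathcal{S}$: 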




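We can readily construct $\overline{\rho}_{\mathcal{P}_U(\tau)}$: 

1. Begin with a U-GLB-closed $\rho_\tau \subseteq C \times A$; 

2. lift it to a U-L-GLB-closed $\rho_{\mathcal{P}_U(\tau)} \subseteq \mathcal{P}_U(C) \times \mathcal{P}_U(A)$; 

3. complete it to a U-GLB-L-LUB-closed $\overline{\rho}_{\mathcal{P}_U(\tau)} \subseteq \mathcal{P}_L(\mathcal{P}_U(C)) \times \mathcal{P}_U(A)$. 

The resulting Galois connection, $\mathcal{P}_L(\mathcal{P}_U(C))\langle\alpha_{\overline{\rho}_{\mathcal{P}_U(\tau)}}, \gamma_{\overline{\rho}_{\mathcal{P}_U(\tau)}}\rangle\mathcal{P}_U(A)$, is 

$\gamma_{\overline{\rho}_{\mathcal{P}_U(\tau)}} T = \{S \mid S \rho_{\mathcal{P}_U(\tau)} T\}$ 

$\alpha_{\overline{\rho}_{\mathcal{P}_U(\tau)}} \mathcal{S} = \sqcap\{T \in \mathcal{P}_U(A) \mid$ for all $S \in \mathcal{S}, S \rho_{\mathcal{P}_U(\tau)} T\}$ 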

Figure 2 presents an example. 

Dams proves, for Galois connection $\mathcal{P}(C)\langle\alpha, \gamma\rangle A$ and transition relation $R \subseteq 
C \times C$, that the most precise, sound, under-approximating abstract transition 
relation, $R^\flat_0 \subseteq A \times A$, is 

$R^\flat_0(a, a')$ iff $a' \in \{\alpha(Y) \mid Y \in \min\{S' \mid R^{\forall\exists}(\gamma(a), S')\}\}$ 

where $R^{\forall\exists}(M, N)$ holds iff for all $m \in M$, there exists $n \in N$ such that $m R n$. 

Recoded as a function and simplified, this reads 

$R^\flat_0(a) = \{\alpha(Y) \mid Y \in \min\{S' \mid$ for all $s \in \gamma(a), R(s) \cap S' \neq \{\}\}\}$ 

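Our machinery gives us the same result: We generate the Galois connection, 
$\mathcal{P}_\downarrow(\mathcal{P}(C)^{op})\langle\alpha_{\overline{\rho}_{\mathcal{P}_U(b)}}, \gamma_{\overline{\rho}_{\mathcal{P}_U(b)}}\rangle\mathcal{P}_\uparrow(A)$. Note that C is a set, so $\mathcal{P}(C)^{op}$ is an upper 
powerset. For transition function, $R : C \to \mathcal{P}(C)$, we generate this most precise, 
sound under-approximating abstract transition function, $R^\flat_{best} : A \to \mathcal{P}_\uparrow(A)$, 

$R^\flat_{best} = \alpha_{\overline{\rho}_{\mathcal{P}_U(b)}} \circ ext(\{|\cdot|\} \circ R^{op}) \circ \gamma_{\overline{\rho}_b}$ 

where $\{|\cdot|\} \circ R^{op} : C \to \mathcal{P}_\downarrow(\mathcal{P}(C)^{op})$ is $(\{|\cdot|\} \circ R^{op})(c) = \downarrow_{\mathcal{P}(C)^{op}}\{R(c)\} = \{S' \mid S' \supseteq R(c)\}$, and 
$ext(\{|\cdot|\} \circ R^{op}) : \mathcal{P}(C) \to \mathcal{P}_\downarrow(\mathcal{P}(C)^{op})$ is $ext(\{|\cdot|\} \circ R^{op})(S) = \downarrow_{\mathcal{P}(C)^{op}}\{R(c) \mid c \in 
S\} = \{S' \supseteq R(c) \mid c \in S\}$, and $\alpha_{\overline{\rho}_{\mathcal{P}_U(b)}}(\mathcal{S}) = \sqcap\{T \mid$ for all $S \in \mathcal{S}, S \rho_{\mathcal{P}_U(b)} T\}$. 

That is, $ext(\{|\cdot|\} \circ R^{op})$ maps a set of arguments to the set of sets of 
$R$-successors, and $\alpha_{\overline{\rho}_{\mathcal{P}_U(b)}}$ produces the largest abstract set that 
under-approximates each successor set $R(c)$, for $c \in \gamma_{\overline{\rho}_b}(a)$. We simplify and obtain 

$R^\flat_{best}(a) = \sqcap\{T \in \mathcal{P}_\uparrow(A) \mid$ for all $a' \in T$, for all $s \in \gamma_{\overline{\rho}_b}(a), R(s) \cap \gamma_{\overline{\rho}_b}(a') \neq \{\}\}$ 

which is provably equal to Dams’s definition¹¹. 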

Finally, dual simulation lifts to sets of arguments: 

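Theorem 17. $R^\flat \triangleleft_{\rho^{-1}} R$ iff $R \rho_{b \to \mathcal{P}_U(b)} R^\flat$ iff $ext(\{|\cdot|\} \circ R^{op}) \rho_{\overline{b} \to \overline{\mathcal{P}_U(b)}} R^\flat$. 

It is a good exercise to attempt to define a Galois connection from a $\rho_{\mathcal{P}_U(b)}$; 
the result is usually degenerate because LUB-closure is over-constraining¹². 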

7 Validation and Refutation Logics 

Hennessy and Milner proved that $\Box\Diamond$-propositions (Hennessy-Milner logic) 
characterize transition relations up to bisimilarity [17]. Loiseaux, et al. [22], 
proved that all $\Box$-properties true of an over-approximating transition relation 
are preserved in the corresponding concrete transition relation and that when 
one over-approximating transition relation is more precise than another, then 
the first preserves all the $\Box$-properties of the second. Dams extended this result 
to under-approximations and $\Diamond$-properties and proved that his definitions of 
$R^\sharp_{best}$, $R^\flat_{best}$ possess the most $\Box\Diamond$-propositions of any sound, mixed transition 
system. 

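In this section, we manufacture Hennessy-Milner logic from our family of 
logical relations (cf. [2]) and obtain the above results as corollaries of 
Galois-connection theory. Recall that these are the typings of the logical relations, 

$\tau ::= b \mid \tau_1 \to \tau_2 \mid \mathcal{P}_L(\tau) \mid \mathcal{P}_U(\tau) \mid \overline{\tau}$ 

where $\overline{\tau}$ is an instance of $\mathcal{P}_L(\tau)$. For each of the first four typings, we define a 
corresponding assertion form, producing this assertion language, 

$\phi ::= p_b \mid f.\phi \mid \forall\phi \mid \exists\phi$ 

and the following semantics of typed judgements (let $D_\tau$ be either $C_\tau$ or $A_\tau$): 

$d \models_b p$ is given, for $d \in D_b$ 
$d \models_{\tau_1 \to \tau_2} f.\phi$ if $f(d) \models_{\tau_2} \phi$, for $d \in D_{\tau_1}$ and $f \in D_{\tau_1 \to \tau_2}$ 
$S \models_{\mathcal{P}_L(\tau)} \forall\phi$ if for all $d \in S$, $d \models_\tau \phi$, for $S \in D_{\mathcal{P}_L(\tau)}$ 
$S \models_{\mathcal{P}_U(\tau)} \exists\phi$ if there exists $d \in S$ such that $d \models_\tau \phi$, for $S \in D_{\mathcal{P}_U(\tau)}$ 

¹¹ $R^\flat_0(a)$ belongs to and is $\sqsubseteq$ all elements in $R^\flat_{best}(a)$. 

¹² Consider Figure 2 and $\rho_{\mathcal{P}_U(Parity)} \subseteq \mathcal{P}(Nat)^{op} \times \mathcal{P}_\uparrow(Parity)$: What is the least set 
of natural numbers that "witnesses" $\{even, any\}$? $\{0\}$? $\{2\}$? LUB-closure fails. 

Since $\overline{\tau}$ is an instance of $\mathcal{P}_L(\tau)$, we define its judgements for abstract values as 

$a \models_{\overline{\tau}} \phi$ if $a \models_\tau \phi$, for $a \in A_\tau$ 

and for concrete values as 

$S \models_{\overline{\tau}} \phi$ if $c \models_\tau \phi$, for all $c \in S$, $S \in \mathcal{P}_L(C_\tau)$ 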

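We might abbreviate $d \models_{\tau \to \mathcal{P}_L(\tau)} R.\forall\phi$ by $d \models \forall R.\phi$ (as in description 
logic [3]) or by $[R]\phi$ (Hennessy-Milner logic [17]) or by $\Box\phi$ when the system 
studied has only one transition relation, $R \subseteq D_\tau \times D_\tau$ (CTL [7]). This hides the 
reasoning on sets. Similarly, $d \models_{\tau \to \mathcal{P}_U(\tau)} R.\exists\phi$ can be abbreviated by $d \models \exists R.\phi$ 
or $\langle R\rangle\phi$ or $\Diamond\phi$. 

The judgements for $\forall\phi$ and $\exists\phi$ employ $R^\sharp$ and $R^\flat$, respectively, to validate 
the assertions, motivating Dams’s mixed transition systems¹³. 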

7.1 Soundness of Judgements 

Assume for all types, $\tau$, that the logical relations, $\rho_\tau \subseteq C_\tau \times A_\tau$, are defined. 
Assume also, for all function symbols, $f$, typed $\tau_1 \to \tau_2$, that there are 
interpretations $f : C_{\tau_1} \to C_{\tau_2}$, and $f^\sharp : A_{\tau_1} \to A_{\tau_2}$, such that $f \rho_{\tau_1 \to \tau_2} f^\sharp$. (Functions $f$ 
and $f^\sharp$ are used in the semantics of $f.\phi$.) 

Definition 18. Judgement form $\models_{\tau'} \phi$ is sound iff for all $c \in C_\tau$, $a \in A_\tau$, 
($a \models_{\tau'} \phi$ holds true and $c \rho_\tau a$) imply that $c \models_{\tau'} \phi$ holds true¹⁴. 

Assume that $\models_b p$ is sound for the choice of $\rho_b \subseteq C_b \times A_b$. 

Theorem 19. For all types, $\tau$, all judgement forms, $\models_\tau \phi$, are sound. 

The proof is an easy induction on the structure of $\tau$. 
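
For a finite system, soundness in the sense of Definition 18 can also be confirmed by brute force. The sketch below is a hedged illustration on the mixed transition system of Figure 1, using the abbreviated modalities: `box` is validated with $R^\sharp$ and `dia` with $R^\flat$ on the abstract side. The base properties (one per abstract value, read as "the state lies in its concretization") and all identifiers are assumptions made for the example.

```python
GAMMA = {"bot": set(), "a0": {"c0"}, "a12": {"c1", "c2"},
         "top": {"c0", "c1", "c2"}}
R = {"c0": {"c1"}, "c1": {"c2"}, "c2": set()}
R_SHARP = {"bot": set(), "a0": {"a12"}, "a12": {"a12"}, "top": {"a12"}}
R_FLAT = {"bot": {"bot"}, "a0": {"a12"}, "a12": set(), "top": set()}

def holds_concrete(c, phi):
    """Concrete judgement: both modalities quantify over R-successors."""
    kind, arg = phi
    if kind == "p":                      # base property: c lies in gamma(arg)
        return c in GAMMA[arg]
    results = [holds_concrete(c2, arg) for c2 in R[c]]
    return all(results) if kind == "box" else any(results)

def holds_abstract(a, phi):
    """Abstract judgement: box uses R_SHARP, dia uses R_FLAT."""
    kind, arg = phi
    if kind == "p":                      # monotone base judgement
        return GAMMA[a] <= GAMMA[arg]
    if kind == "box":
        return all(holds_abstract(a2, arg) for a2 in R_SHARP[a])
    return any(holds_abstract(a2, arg) for a2 in R_FLAT[a])

def formulas(depth):
    """All formulas with at most `depth` nested modalities."""
    if depth == 0:
        return [("p", a) for a in GAMMA]
    smaller = formulas(depth - 1)
    return smaller + [(k, f) for k in ("box", "dia") for f in smaller]

# Pairs where the abstract judgement holds but a related concrete state fails.
unsound = [(a, c, phi) for phi in formulas(3) for a in GAMMA for c in GAMMA[a]
           if holds_abstract(a, phi) and not holds_concrete(c, phi)]

example = ("dia", ("box", ("p", "a12")))
```

Swapping the roles of `R_SHARP` and `R_FLAT` in `holds_abstract` would break the check, which is one way to see why a mixed transition system carries two relations.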

We can add the logical connectives, 

$d \models_\tau \phi_1 \wedge \phi_2$ if $d \models_\tau \phi_1$ and $d \models_\tau \phi_2$ 
$d \models_\tau \phi_1 \vee \phi_2$ if $d \models_\tau \phi_1$ or $d \models_\tau \phi_2$ 

and prove these sound, but we will require a dual logic, a refutation logic, to 
define a sound semantics for $\neg\phi$; we do so momentarily. 

7.2 Best Precision of Judgements 

Say that a judgement form, $\models_{\tau'} \phi$, is monotone if $a \models_{\tau'} \phi$ and $a' \sqsubseteq_\tau a$ imply 
$a' \models_{\tau'} \phi$, for all $a, a' \in A_\tau$¹⁵. We assume that all base-type judgements, $\models_b p_b$, 
are monotone, and from this it follows that all judgement forms are monotone. 
As a consequence, we have immediately Dams’s best-precision result: 

Theorem 20. For Galois connection, $\mathcal{P}(C)\langle\alpha, \gamma\rangle A$, and every $R : C \to \mathcal{P}(C)$, 
$R^\sharp_{best} : A \to \mathcal{P}_L(A)$ and $R^\flat_{best} : A \to \mathcal{P}_U(A)$ soundly prove the most typed 
judgements, $a \models_\tau \phi$, for all $a \in A$ and choices of $\mathcal{P}_L(A)$ and $\mathcal{P}_U(A)$. 

¹³ For set, $C_\tau$, $\mathcal{P}(C_\tau)$ is a strongly lower powerset and $\mathcal{P}(C_\tau)^{op}$ is an upper powerset, 
so we can readily validate $\forall\phi$- and $\exists\phi$-properties on concrete sets, also. 

¹⁴ The judgement form, $\models_{\tau_1 \to \tau_2} f.\phi$, shows that $\tau'$ need not be $\tau$. 

¹⁵ The intuition is that $\gamma_{\rho_\tau}(a') \subseteq \gamma_{\rho_\tau}(a) \subseteq [\phi] \subseteq C_\tau$. 

7.3 Validating $\neg\phi$ Requires a Refutation Logic 

For $c \in C$, we define $c \models_\tau \neg\phi$ iff $c \not\models_\tau \phi$. 

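The logic in Section 7 validates properties, so we might have also a logic that 
refutes them: Read $a \models^{not}_\tau \phi$ as "it is not possible that any value modelled by 
$a \in A_\tau$ has property $\phi$." 

$a \models^{not}_b p$ is given, for $a \in A_b$ 
$a \models^{not}_{\tau_1 \to \tau_2} f.\phi$ if $f^\sharp(a) \models^{not}_{\tau_2} \phi$, for $a \in A_{\tau_1}$, $f^\sharp \in A_{\tau_1 \to \tau_2}$ 
$T \models^{not}_{\mathcal{P}_U(\tau)} \forall\phi$ if exists $a \in T$, $a \models^{not}_\tau \phi$, for $T \in A_{\mathcal{P}_U(\tau)}$ 
$T \models^{not}_{\mathcal{P}_L(\tau)} \exists\phi$ if for all $a \in T$, $a \models^{not}_\tau \phi$, for $T \in A_{\mathcal{P}_L(\tau)}$ 
$a \models^{not}_{\overline{\tau}} \phi$ if $a \models^{not}_\tau \phi$, for $a \in A_\tau$ 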





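In the refutation logic, the roles of $\mathcal{P}_L(\tau)$ and $\mathcal{P}_U(\tau)$ are exchanged. 

Definition 21. $\models^{not}_{\tau'} \phi$ is sound iff for all $c \in C_\tau$, $a \in A_\tau$, $c \rho_\tau a$ and $a \models^{not}_{\tau'} \phi$ 
imply $c \not\models_{\tau'} \phi$. 

Proposition 22. For all types, $\tau$, $\models^{not}_\tau \phi$ are sound and monotone, assuming 
that the base-type judgements, $\models^{not}_b p_b$, are¹⁶. 

Corollary 23. The judgement definitions, 
$a \models_\tau \neg\phi$ if $a \models^{not}_\tau \phi$ 
$a \models^{not}_\tau \neg\phi$ if $a \models_\tau \phi$ 
are both sound and monotone. 

The Sagiv-Reps-Wilhelm TVLA system simultaneously calculates validation 
and refutation logics [27]. Indeed, we might combine $\rho_{\mathcal{P}_L(\tau)}$ and $\rho_{\mathcal{P}_U(\tau)}$ into $\rho_{\mathcal{P}(\tau)} \subseteq 
\mathcal{P}(C) \times (\mathcal{P}_L(A) \times \mathcal{P}_U(A))$. This motivates sandwich- and mixed-powerdomains 
in a theory of over-under-approximation of sets [5, 13, 16, 18, 19]. 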



8 Related Work 

In addition to Dams’s work [10, 11], three other lines of research deserve mention: 
Loiseaux, et al. [22] showed an equivalence between simulations and Galois 
connections: For sets C and A, and $\rho \subseteq C \times A$, they note that 
$\mathcal{P}(C)\langle post[\rho], \widetilde{pre}[\rho]\rangle\mathcal{P}(A)$ is always a Galois connection¹⁷. 
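
The claim that this pair always forms a Galois connection is easy to test exhaustively on small sets. The sketch below is an illustration only; it uses the definitions of $post[\rho]$ and $\widetilde{pre}[\rho]$ from the accompanying footnote, and the carrier sets `C` and `A` as well as the function names are arbitrary choices made for the example.

```python
from itertools import chain, combinations, product

C = [0, 1, 2]       # arbitrary small concrete set (assumption)
A = ["x", "y"]      # arbitrary small abstract set (assumption)

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, n) for n in range(len(xs) + 1))]

def post(rho, s):
    """post[rho](S): abstract values related to some element of S."""
    return frozenset(a for a in A if any((c, a) in rho for c in s))

def pre_tilde(rho, t):
    """pre~[rho](T): concrete values all of whose related values lie in T."""
    return frozenset(c for c in C if all(a in t for a in A if (c, a) in rho))

def is_galois_connection(rho):
    """post[rho](S) is a subset of T  iff  S is a subset of pre~[rho](T)."""
    return all((post(rho, s) <= t) == (s <= pre_tilde(rho, t))
               for s in subsets(C) for t in subsets(A))

# Every relation between C and A, 2^6 of them.
all_relations = subsets(product(C, A))
counterexamples = [rho for rho in all_relations
                   if not is_galois_connection(rho)]
```

No relation on these carriers yields a counterexample, which matches the observation that no closure condition on $\rho$ is needed once both sides are plain powersets.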

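For $R \subseteq C \times C$ and $R^\sharp \subseteq A \times A$, simulation is equivalently defined as $R$ is 
$\rho$-simulated by $R^\sharp$ iff $R^{-1} \cdot \rho \subseteq \rho \cdot (R^\sharp)^{-1}$. Treating $R^{-1}$ and $(R^\sharp)^{-1}$ as functions, 
we can define Galois-connection soundness as 

$(R^\sharp)^{-1}$ is a sound over-approximation for $R^{-1}$ with respect to $\gamma$ iff 
$pre[R] \circ \gamma \sqsubseteq \gamma \circ pre[R^\sharp]$ 

¹⁶ The intuition is that $a \models^{not}_\tau \phi$ implies $\gamma_{\rho_\tau}(a) \cap [\phi] = \{\}$. 

¹⁷ $\widetilde{pre}[\rho] = \lambda T.\{c \mid \{a \mid c \rho a\} \subseteq T\}$ is $\rho$ "reduced" to an under-approximation function, 
and $post[\rho] = \lambda S.\{a \mid$ exists $c \in S, c \rho a\}$. A’s partial ordering, if any, is forgotten. 

For $\rho$, $R$, $R^\sharp$, Loiseaux, et al. prove 

1. $R$ is $\rho$-simulated by $R^\sharp$ iff $(R^\sharp)^{-1}$ is sound for $R^{-1}$ w.r.t. $\widetilde{pre}[\rho]$. 

2. $a \models \phi \in$ ACTL [7] implies $c \models \phi$, for $c \rho a$. 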

Backhouse and Backhouse [4] saw that Galois connections can be character- 
ized within relational algebra, and they reformulated key results of Abramsky 
[1]: $\rho \subseteq C \times A$ is a pair algebra iff there exist $\alpha : C \to A$ and $\gamma : A \to C$ such 
that $\{(c, a) \mid \alpha c \sqsubseteq_A a\} = \rho = \{(c, a) \mid c \sqsubseteq_C \gamma a\}$. 

For the category, $\mathcal{C}$, of partially ordered sets (objects) and binary relations 
(morphisms), if an endofunctor, $\sigma : \mathcal{C} \to \mathcal{C}$, is also 

1. monotonic: for relations, $R, S \subseteq C \times C'$, $R \subseteq S$ implies $\sigma R \subseteq \sigma S$ 

2. invertible: for all relations, $R \subseteq C \times C'$, $(\sigma R)^{-1} = \sigma(R^{-1})$, 

then $\sigma$ maps pair algebras to pair algebras, that is, $\sigma$ is a unary type constructor 
that lifts a Galois connection between $C$ and $A$ to one between $\sigma C$ and $\sigma A$. 

The result generalizes to n-ary functors and applies to the standard functors, 
$\tau \times \tau$, $\tau \to \tau$, $List(\tau)$, etc. But the result does not apply to $\mathcal{P}_L(\tau)$ nor $\mathcal{P}_U(\tau)$ - 
invertibility (2) fails. 

Ranzato and Tapparo [26] studied the completion of upper closure maps, 
$\rho : \mathcal{P}(C) \to \mathcal{P}(C)$. Given a logic, $\mathcal{L}$, of form, $\phi ::= op_i(\phi_j)_{0 \le j < |op_i|}$, its 
semantics, $[\![\cdot]\!] \in \mathcal{P}(C)$, has format 

$[\![op_i(\phi_j)]\!] = f_i([\![\phi_j]\!])_{0 \le j < |op_i|}$ 

where each $f_i : \mathcal{P}(C)^{|op_i|} \to \mathcal{P}(C)$ gives the semantics of connector $op_i$. The 
abstract semantics has form, $[\![op_i(\phi_j)]\!]^\rho = (\rho \circ f_i)([\![\phi_j]\!]^\rho)_{0 \le j < |op_i|}$, and $[\![\phi]\!]^\rho \in \rho[\mathcal{P}(C)]$. 

Upper closure $\rho$ is $\mathcal{L}$-preserving if, for all $S \subseteq C$, $\rho S \subseteq [\![\phi]\!]^\rho$ implies $S \subseteq [\![\phi]\!]$, 
and it is $\mathcal{L}$-strongly preserving if the "implies" is replaced by "iff". 

Given an $\mathcal{L}$-preserving $\rho$, Ranzato and Tapparo apply the domain-completion 
technique of Giacobazzi and Quintarelli [12] to complete $\rho$ to its coarsest, 
strongly preserving form: $complete(\rho) = gfp(\lambda\rho.\, \rho \sqcap M(R_{\{f_i\}}(\rho)))$, 

where $\sqcap$ operates in the complete lattice of upper closures, $M$ is the Moore 
completion, and $R_F(\rho) = \{f(x) \mid f \in F, x \in \rho[\mathcal{P}(C)]^{arity(f)}\}$ adds the image points 
of the logical operations, $f_i$, to the domain. 

This technique can be applied to the present paper to generate strongly 
preserving, over- and under-approximating Galois connections. 

Abstract. In this paper we study the relation between the lack of com- 
pleteness in abstract symbolic trajectory evaluation and the structure 
of the counterexamples that can be derived in case of property failure. 
We characterize the presence of false negatives as a loss of completeness 
of the underlying abstraction. We prove how standard completeness re- 
finement in abstract interpretation provides a systematic way for refining 
abstract symbolic trajectory evaluation in order to gain completeness for 
the properties of interest. 

Keywords: Abstract Interpretation, Completeness, Domain Refinement, 
Symbolic Trajectory Evaluation, Verification, Model-checking, Data Flow 
Analysis. 



1 Introduction 

Symbolic trajectory evaluation (STE) provides a means to formally verify properties
over sequential systems [1, 10, 14]. STE is usually presented as one of the
main alternatives to symbolic model checking (SMC). One advantage of STE compared
with SMC is that STE can deal with larger circuits, because the complexity
of the verification algorithm is determined largely by the property to be
verified. As a drawback, STE is limited in the kind of properties it
can handle. In recent years, several efforts have been made to extend the expressiveness
of STE to that of SMC while preserving the benefits of STE. A generalized
version of STE, called generalized symbolic trajectory evaluation (GSTE), has
been proposed; it extends STE to all $\omega$-regular properties, making
GSTE as powerful as traditional SMC for linear time logic [15-17].

In this paper we consider the earlier STE introduced in [10, 14], and not
its generalized, and computationally more expensive, version. An STE property
represents a set of constraints (pre- and postconditions) that have to be satisfied
along any computational path of the system. In STE it is the property that drives
the algorithm in simulating the computational flow of the system to be verified.
Pre- and postconditions are expressed as predicates on states. When the number
of states increases, this may lead to a phenomenon similar to the state explosion
problem in SMC [2, 6], making some sort of abstraction mandatory. Once abstraction
is introduced, it is important to check whether the results of the approximate
analysis still hold in the concrete system. Since the relationship between abstract
and concrete models is traditionally formalized and studied by abstract interpretation
[4], it is natural to observe how abstract symbolic trajectory evaluation
(ASTE) and STE can be properly related in abstract interpretation theory [1].
Abstract interpretation provides here the right framework for proving correctness
of ASTE with respect to STE.

The Problem 

The idea of ASTE is that of verifying temporal properties against an approxi- 
mated model, which is systematically derived from the concrete semantics of the 
system we want to analyze. This is always achieved by approximating the infor- 
mation contained in its states. Such approximation, formalized in the abstract 
interpretation framework, is proved to be correct but not complete [1], meaning 
that, while the satisfaction of a property on the abstract model implies the sat- 
isfaction of the same property on the concrete one, it is not possible to draw any 
information on the behavior of the real system when a property does not hold 
in the approximate model. In fact it could happen that the abstract model does 
not satisfy the property while the real one does, due to the loss of information 
implicit in the abstraction phase. In this case we say that the abstract analysis 
returns a false negative. The notion of completeness in abstract interpretation 
formalizes the fact that no loss of precision is accumulated in abstract computa- 
tion [5, 8]. This means that the approximation of a semantic function, computed 
on abstract objects, is equivalent to the approximation of the same computa- 
tion on concrete objects. Giacobazzi et al. [8] observed that completeness is a 
domain property, namely that completeness of an abstract interpretation only 
depends upon the structure of the underlying abstract domain, and that it is 
always possible to minimally refine or simplify abstract domains to make them 
complete. 

In this paper we are interested in applying abstract domain transformers to
refine ASTE in order to make it complete (i.e., no false negatives are possible)
for the verification of the properties of interest. To avoid reintroducing
the state explosion phenomenon, the size of the refined domain has to be kept
as small as possible. Therefore, we are interested in minimally transforming
abstractions to make them complete for ASTE. In [7] a similar idea has been
applied to make the abstract model checking algorithm strong preserving with
respect to the fragment $\forall$CTL* of the branching time temporal logic CTL*.
More recent results [13] have proved that the problem of minimally refining an
abstract model in order to get strong preservation for some specification language
$\mathcal{L}$ corresponds precisely to the problem of minimally refining the underlying
abstract interpretation in order to get completeness with respect to the (adjoint
of the) logical/temporal operators in $\mathcal{L}$. As far as we know, no applications of
these techniques are known in the field of STE.

Main Results 

In ASTE, soundness means that the satisfaction of a property in the abstract
model implies the satisfaction of the same property in the concrete one. However,
the approximation may not be complete (later called strong preserving, as
in SMC). In fact, if the property of interest is false in the abstract model, this
failure may be caused by some particular computations in the approximated
model which do not arise in the concrete one. As in the abstract model
checking case [3, 6, 12], the traces of states corresponding to these computations
are called spurious. A property in STE is expressed by defining preconditions
and postconditions that have to hold along every computation trace. The verification
of a property is then performed by checking that every possible
trace, as long as it satisfies the preconditions, also satisfies the corresponding
postconditions. Therefore, when the property is not satisfied, a state has been
reached that does not satisfy the corresponding postcondition.
In this case the STE algorithm does not explicitly return a counterexample, even
though one can be derived if the point of the computation where
the postcondition fails is known. Hence, as in SMC, strong preservation in STE
corresponds to the absence of spurious counterexamples. We prove that spurious
counterexamples can be removed in ASTE by refining abstractions. Since
the logic used by STE is far less expressive than $\forall$CTL*, it turns out that the
completeness requirement for the abstraction is weaker than the one in [7, 13]
that makes abstract model checking strong preserving. This means that we only
need to be precise for a smaller set of logical operators than $\forall$CTL*, making the
refined abstraction weaker, and therefore more efficient when applied to ASTE
algorithms.



Structure of the Paper 

The paper is organized as follows: in Section 2 we recall the main notions con- 
cerning abstract interpretation theory and standard completeness as a domain 
property, showing a constructive method to minimally modify domains in order 
to achieve completeness. In Section 3 we present STE in its standard and ab- 
stract version, which is sound but not complete. Then we show how both STE 
and ASTE can be expressed by classical data flow analysis (DFA). In Section 
4 we define a systematic method, derived from the standard completeness re- 
finement of the abstract interpretation, for refining the STE abstract model of 
the system in order to gain strong preservation. We conclude in Section 5 with 
related works and a discussion of possible further investigations in this field. 

2 Abstract Interpretation and Completeness 

In the standard Cousot and Cousot abstract interpretation theory, abstract domains
can be specified either by Galois connections (GCs), i.e. adjunctions, or by
upper closure operators (uco) [4]. A complete lattice, denoted $\langle D, \leq, \vee, \wedge, \top, \bot \rangle$,
is a set $D$ equipped with an ordering relation $\leq$, where for any $S \subseteq D$: $\bigvee S$ is
the lub of $S$, $\bigwedge S$ is the glb of $S$, $\top$ is the greatest element and $\bot$ is the least
element. A Galois connection is given by $(C, \alpha, A, \gamma)$, where the concrete and
abstract domains $C$ and $A$ are complete lattices related by a pair of adjoint
maps. The functions $\alpha : C \to A$ and $\gamma : A \to C$ form an adjunction when for all
$a \in A$, $c \in C$: $\alpha(c) \leq_A a \Leftrightarrow c \leq_C \gamma(a)$. In this case $\alpha$ and $\gamma$ are the monotone
abstraction and concretization maps [4]. When each value of the abstract domain $A$
is useful in representing $C$, namely when $\forall a \in A.\ \alpha(\gamma(a)) = a$, then $(C, \alpha, A, \gamma)$
is a Galois insertion (GI). An upper closure operator on a poset $C$ is an operator
$\rho : C \to C$ which is monotone, idempotent, and extensive ($\forall x \in C.\ x \leq \rho(x)$).
The set of all upper closure operators on $C$ is denoted by $uco(C)$. Given a complete
lattice $C$, it is well known that $\langle uco(C), \sqsubseteq, \sqcup, \sqcap, \lambda x.\top, \lambda x.x \rangle$ is a complete
lattice. The ordering on $uco(C)$ corresponds precisely to the standard order used
in abstract interpretation to compare abstract domains with regard to their precision:
$A$ is more precise (or concrete) than $B$ iff $A \sqsubseteq B$ in $uco(C)$. Each closure
is uniquely determined by the set of its fix-points $\rho(C)$; in particular $X \subseteq C$ is
the set of fix-points of an upper closure $\rho$ on $C$ iff $X$ is a Moore-family of $C$,
i.e., $X = \mathcal{M}(X) = \{\bigwedge S \mid S \subseteq X\}$, where $\bigwedge \emptyset = \top \in \mathcal{M}(X)$, iff $X$ is isomorphic
to an abstract domain $A$ in a GI $(C, \alpha, A, \gamma)$, i.e. $A \cong \rho(C)$ with $\iota : \rho(C) \to A$
and $\iota^{-1} : A \to \rho(C)$ being an isomorphism. In this case $(C, \iota \circ \rho, A, \iota^{-1})$ is a GI
where $\rho = \gamma \circ \alpha$. Therefore $uco(C)$ is isomorphic to the so called lattice of abstract
interpretations of $C$ [5]. In this case $A \sqsubseteq B$ iff $B \subseteq A$ as Moore families of $C$,
iff $A$ is more concrete than $B$. Recall that given a complete lattice $C$, the downward
closure of $S \subseteq C$ is defined as $\downarrow S = \{x \in C \mid \exists y \in S.\ x \leq y\}$, and $\downarrow x$ is a
shorthand for $\downarrow \{x\}$. Recall that a function $f : C \to C$ is continuous (additive)
if $f$ preserves lub's of nonempty chains (arbitrary sets).

Let $(C, \alpha, A, \gamma)$ be a GI, $f : C \to C$ be a continuous function and $f^\sharp : A \to A$ be
a corresponding abstract function on the abstract domain $A$. Then $(C, \alpha, A, \gamma)$
and $f^\sharp$ provide a sound abstraction of $f$ if $\alpha \circ f \leq f^\sharp \circ \alpha$. Completeness is guaranteed
when such condition is satisfied with equality, namely $(C, \alpha, A, \gamma)$ and
$f^\sharp$ are complete for $f$ if $\alpha \circ f = f^\sharp \circ \alpha$ [8]. It has been proved that completeness
is a property of abstract domains [8]. In particular there exists $f^\sharp : A \to A$
such that $(C, \alpha, A, \gamma)$ and $f^\sharp$ are complete for $f$ iff $\alpha \circ f = \alpha \circ f \circ \gamma \circ \alpha$ [8]. This
result constructively characterizes the structure of complete abstract domains
for continuous functions. Recall that, if $f : C \to C$ is a unary function, then
$f^{-1}(y) = \{x \mid f(x) = y\}$.
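The definitions above can be checked mechanically on a small finite lattice. The following Python sketch is an illustration only and is not part of the original development: it represents an upper closure operator on a powerset by its Moore family of fix-points and tests completeness in the closure form $\rho \circ f = \rho \circ f \circ \rho$, which for a GI is equivalent to $\alpha \circ f = \alpha \circ f \circ \gamma \circ \alpha$. The helper names and the three-state example are hypothetical.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def moore_closure(family, universe):
    """Close a family of subsets under intersection and add the top element."""
    closed = {frozenset(universe)} | {frozenset(s) for s in family}
    while True:
        new = {a & b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new

def closure_operator(moore_family):
    """rho(x) = least element of the Moore family that contains x."""
    def rho(x):
        result = None
        for m in moore_family:
            if x <= m:
                result = m if result is None else result & m
        return result
    return rho

def is_complete(rho, f, universe):
    """Check rho(f(x)) == rho(f(rho(x))) for every subset x."""
    return all(rho(f(x)) == rho(f(rho(x))) for x in powerset(universe))

# Hypothetical example: three states with transitions 1 -> 2 -> 3 -> 3.
STATES = {1, 2, 3}
R = {(1, 2), (2, 3), (3, 3)}

def post(x):
    return frozenset(t for (s, t) in R if s in x)

coarse = closure_operator(moore_closure([{1, 2}], STATES))
exact = closure_operator(moore_closure(powerset(STATES), STATES))
print(is_complete(coarse, post, STATES))  # False
print(is_complete(exact, post, STATES))   # True
```

The coarse domain fails because $\rho(post(\{1\})) = \{1,2\}$ while $\rho(post(\rho(\{1\}))) = \{1,2,3\}$; the identity closure is trivially complete.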

Theorem 1 ([8]). Let $f : C \to C$ be continuous and $\rho \in uco(C)$. Then $\rho$ is
complete for $f$ iff $\bigcup_{y \in \rho(C)} \max(f^{-1}(\downarrow y)) \subseteq \rho(C)$.

By closure under (maximal) inverse image of $f$ we get the most abstract domain
which is complete and includes the given domain [8]. Let $\mathcal{R}_f : uco(C) \to uco(C)$
be defined as $\mathcal{R}_f = \lambda X \in uco(C).\ \mathcal{M}(\bigcup_{y \in X} \max(f^{-1}(\downarrow y)))$. It has been proved
in [8] that if $f : C \to C$ is a continuous function and $A \in uco(C)$ then:

$X$ is complete for $f$ and $X \sqsubseteq A$ iff $X = A \sqcap \mathcal{R}_f(X)$

Therefore, the greatest (viz., most abstract) domain which includes $A$ and which
is complete for $f$ is $\mathbb{S}_f(A) = gfp(\lambda X.\ A \sqcap \mathcal{R}_f(X))$. This domain is called the
complete shell of $A$ with respect to $f$.
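On a finite powerset the complete shell can be approximated by a naive iteration: start from the Moore closure of the given domain and keep adding maximal inverse images until nothing changes. The sketch below is a brute-force illustration under that assumption (it enumerates all subsets, so it is only usable on tiny examples); the function names and the example relation are hypothetical.

```python
from itertools import combinations

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def moore_closure(family, universe):
    """Close a family of subsets under intersection and add the top element."""
    closed = {frozenset(universe)} | {frozenset(s) for s in family}
    while True:
        new = {a & b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new

def max_inverse_images(f, y, subsets):
    """Maximal subsets x with f(x) included in y, i.e. max(f^-1(down y))."""
    candidates = [x for x in subsets if f(x) <= y]
    return {x for x in candidates if not any(x < z for z in candidates)}

def complete_shell(domain, f, universe):
    """Least Moore family containing `domain` and closed under max inverse images."""
    subsets = powerset(universe)
    current = moore_closure(domain, universe)
    while True:
        extra = set()
        for y in current:
            extra |= max_inverse_images(f, y, subsets)
        refined = moore_closure(current | extra, universe)
        if refined == current:
            return current
        current = refined

# Hypothetical example: three states with transitions 1 -> 2 -> 3 -> 3.
STATES = {1, 2, 3}
R = {(1, 2), (2, 3), (3, 3)}

def post(x):
    return frozenset(t for (s, t) in R if s in x)

shell = complete_shell([{1, 2}], post, STATES)
print(sorted(sorted(m) for m in shell))  # [[], [1], [1, 2], [1, 2, 3]]
```

Starting from the domain $\{\{1,2\}, \top\}$, the iteration adds $\{1\}$ (the largest set whose image stays inside $\{1,2\}$) and then $\emptyset$.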
3 STE and Abstract-STE

In this section we present STE through the formalization adopted by Chou in
[1], where this technique was defined both in its concrete and abstract version by
abstract interpretation. Chou proved that it is possible to express the STE algorithm
as a DFA problem, which makes abstract interpretation a natural framework
for modeling STE problems.



Symbolic Trajectory Evaluation 

Given a system $M$, let $S$ denote the set of all its states, which is nonempty and
finite. A state is an assignment of values to variables (signals) and it can be
represented as a boolean vector, where each element corresponds to a particular
system signal that can assume either the value 0 or 1. Let $R \subseteq S \times S$ be the transition
relation on $M$, where $(s, s') \in R$ means that $M$ can in one step move from
state $s$ to state $s'$. The function $post[R] : \wp(S) \to \wp(S)$ is the forward predicate
transformer associated with $R$, where $post[R](X) = \{s \in S \mid \exists x \in S.\ R(x, s) \wedge x \in X\}$.
Dually, we can define the backward predicate transformer associated with $R$ as
$\widetilde{pre}[R] : \wp(S) \to \wp(S)$, where $\widetilde{pre}[R](X) = \{s \in S \mid \forall x \in S.\ R(s, x) \Rightarrow x \in X\}$. It
is well known that these two predicate transformers form an adjunction, namely
that $(\wp(S), post[R], \wp(S), \widetilde{pre}[R])$ is a GC [11]. Let $\langle \wp(S), \subseteq \rangle$ be the complete
lattice of all possible predicates over states. A system in STE is modeled by a
function $M : \wp(S) \to \wp(S)$ defined as: $M(p) = post[R](p)$. Note that $M$ is an
additive function that, given a state $s \in S$, returns the least specified predicate
the system can evolve to. Given a sequence of elements $s$, we denote with $s[i]$
the $i$-th element of such sequence. A trajectory in $M$ is a nonempty sequence of
states, $\tau \in S^+$, such that $\forall i \in \mathbb{N} : 0 < i < |\tau| \Rightarrow \tau[i] \in M(\{\tau[i-1]\})$. The set of
trajectories of $M$ is denoted by $Traj(M)$. It is possible to define a partial order
on the trajectories of a system, extending the existing order on $\wp(S)$. Given two
trajectories $\tau$ and $\tau'$ of the same length:

$$\tau \sqsubseteq \tau' \Leftrightarrow \forall i \in \mathbb{N} : 0 \leq i < |\tau| \Rightarrow \tau[i] \subseteq \tau'[i]$$
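The two predicate transformers and the trajectory condition defined above are easy to prototype for an explicit transition relation. The following Python sketch is illustrative only; the state names and the relation `R` are hypothetical, and the adjunction between `post` and `pre_tilde` is checked by brute force over all pairs of subsets.

```python
from itertools import combinations

def powerset(states):
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def post(R, X):
    """post[R](X): states reachable in one step from some state in X."""
    return frozenset(t for (s, t) in R if s in X)

def pre_tilde(R, X, states):
    """pre~[R](X): states all of whose successors lie in X."""
    return frozenset(s for s in states
                     if all(t in X for (u, t) in R if u == s))

def is_adjunction(R, states):
    """Check post[R](X) <= Y  iff  X <= pre~[R](Y) for all subsets X, Y."""
    subsets = powerset(states)
    return all((post(R, X) <= Y) == (X <= pre_tilde(R, Y, states))
               for X in subsets for Y in subsets)

def is_trajectory(R, tau):
    """A nonempty state sequence where each state is a successor of the previous one."""
    return len(tau) > 0 and all(tau[i] in post(R, {tau[i - 1]})
                                for i in range(1, len(tau)))

# Hypothetical three-state system.
STATES = {"s0", "s1", "s2"}
R = {("s0", "s1"), ("s0", "s2"), ("s1", "s2"), ("s2", "s2")}

print(is_adjunction(R, STATES))                 # True
print(is_trajectory(R, ["s0", "s1", "s2"]))     # True
print(is_trajectory(R, ["s1", "s0"]))           # False
```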

The properties of the system are expressed through a particular labeled graph,
called assertion graph or trajectory assertion [1, 10]. A trajectory assertion for $M$
is a quintuple $G = (V, v_0, R_G, \pi_a, \pi_c)$, where $V$ is a finite set of vertexes, $v_0 \in V$
is the initial vertex, $R_G \subseteq V \times V$ is a transition relation, and $\pi_a : V \to \wp(S)$
and $\pi_c : V \to \wp(S)$ label each vertex $v$ respectively with an antecedent $\pi_a(v)$,
also called precondition, and a consequent $\pi_c(v)$, also called postcondition. A
path of $G$ is a nonempty sequence of vertexes, $\mu \in V^+$, such that $\mu[0] = v_0$ and
$\forall i \in \mathbb{N} : 0 < i < |\mu| \Rightarrow (\mu[i-1], \mu[i]) \in R_G$. The set of paths of $G$ is denoted
by $Paths(G)$. The circuit $M$ satisfies the trajectory assertion $G$ when, for every
trajectory $\tau$ in $Traj(M)$ and for every path $\mu$ in $Paths(G)$, as long as $\tau$ satisfies
the antecedents in $\mu$, $\tau$ satisfies the consequents in $\mu$. Formally the circuit $M$
satisfies the assertion graph $G$, denoted by $M \models G$, if and only if [1]:

$$\forall \tau \in Traj(M) : \forall \mu \in Paths(G) : |\tau| = |\mu| \Rightarrow (\tau \models_a \mu \Rightarrow \tau \models_c \mu) \qquad (1)$$

where $\tau$ a-satisfies (resp., c-satisfies) $\mu$, denoted by $\tau \models_a \mu$ (resp., $\tau \models_c \mu$), iff
$\tau[i] \in \pi_a(\mu[i])$ (resp., $\tau[i] \in \pi_c(\mu[i])$) for each $i < |\tau| = |\mu|$.



STE as a DFA Problem 



The STE algorithm can be formulated as a DFA problem, namely it is possible
to investigate the validity of a property on a circuit by solving a standard data
flow equation [1]. Let $\mathcal{T} : (V \to \wp(S)) \to (V \to \wp(S))$ be defined as:

$$\mathcal{T}(\Phi)(v) = \begin{cases} S & \text{if } v = v_0 \\ \bigcup \{F(v')(\Phi(v')) \mid (v', v) \in R_G\} & \text{otherwise} \end{cases}$$

where $F : V \to (\wp(S) \to \wp(S))$ is $F(v)(p) = M(\pi_a(v) \cap p)$ for each vertex
$v \in V$ and for all predicates $p \in \wp(S)$. The fix-point equation $\Phi = \mathcal{T}(\Phi)$ has a
least solution $\Phi_* : V \to \wp(S)$, which is computed as the limit of the following
sequence $\langle \Phi_n : V \to \wp(S) \mid n \in \mathbb{N} \rangle$, where:

$$\Phi_n = \begin{cases} \lambda v \in V.\ \emptyset & \text{if } n = 0 \\ \mathcal{T}(\Phi_{n-1}) & \text{otherwise} \end{cases}$$

The circuit $M$ satisfies the assertion graph $G$, denoted $M \models_C G$, iff [1]:

$$\forall v \in V : \Phi_*(v) \cap \pi_a(v) \subseteq \pi_c(v) \qquad (2)$$

The following theorem proves the equivalence between STE and a DFA problem.

Theorem 2 ([1]). $M \models G \Leftrightarrow M \models_C G$
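The data flow formulation translates almost directly into code. The following Python sketch iterates the transformer from the empty labelling until it stabilizes and then checks condition (2). It is a minimal illustration on an explicit state set, not the symbolic algorithm used in practice; the two-bit counter and the assertion graph are hypothetical.

```python
def post(R, X):
    """post[R](X): one-step successors of the states in X."""
    return frozenset(t for (s, t) in R if s in X)

def ste_fixpoint(states, R, vertices, v0, RG, pi_a):
    """Least solution of Phi = T(Phi), computed by iteration from the empty labelling."""
    phi = {v: frozenset() for v in vertices}
    while True:
        nxt = {}
        for v in vertices:
            if v == v0:
                nxt[v] = frozenset(states)
            else:
                acc = frozenset()
                for (u, w) in RG:
                    if w == v:
                        # F(u)(Phi(u)) = M(pi_a(u) & Phi(u)) with M = post[R]
                        acc |= post(R, pi_a[u] & phi[u])
                nxt[v] = acc
        if nxt == phi:
            return phi
        phi = nxt

def satisfies(phi, vertices, pi_a, pi_c):
    """Condition (2): Phi*(v) & pi_a(v) is included in pi_c(v) for every vertex."""
    return all(phi[v] & pi_a[v] <= pi_c[v] for v in vertices)

# Hypothetical system: a counter modulo 4.
STATES = frozenset(range(4))
R = {(s, (s + 1) % 4) for s in STATES}

# Hypothetical assertion: starting from 0, the next two states are 1 and 2.
V = ["v0", "v1", "v2"]
RG = [("v0", "v1"), ("v1", "v2")]
PI_A = {"v0": frozenset({0}), "v1": STATES, "v2": STATES}
PI_C = {"v0": STATES, "v1": frozenset({1}), "v2": frozenset({2})}

phi_star = ste_fixpoint(STATES, R, V, "v0", RG, PI_A)
print(satisfies(phi_star, V, PI_A, PI_C))  # True
```

Changing the consequent of `v2` to `{3}` makes the check fail, since the fix-point assigns `{2}` to that vertex.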



Abstract Symbolic Trajectory Evaluation 

The state explosion problem limits the efficiency of STE: the STE algorithm
on boolean vectors is practical only for small systems [1]. As the system
becomes larger, the STE algorithm is increasingly likely to run into the state
explosion problem. To overcome this problem some sort of abstraction must
be applied to the system. Let $\langle P, \leq \rangle$ be a complete lattice of abstract predicates
such that there is a GC $(\wp(S), \alpha, P, \gamma)$, where $\alpha : \wp(S) \to P$ and $\gamma : P \to \wp(S)$
are respectively the abstraction and concretization maps. The abstract interpretation
of $M$ over $P$, namely the abstract model, is given by the function
$M^\sharp : P \to P$, defined as the best correct approximation of $M$ in $P$ [1]:

$$M^\sharp(p) = \gamma \circ \alpha(post[R](\gamma(p)))$$

In the following $\rho = \gamma \circ \alpha$. Note that $M^\sharp$ does not distribute over arbitrary union,
i.e., it is not additive. An abstract trajectory assertion for the abstract system
$M^\sharp$ is therefore a quintuple $G^\sharp = (V, v_0, R_G, \pi^\sharp_a, \pi^\sharp_c)$, where the antecedent and
consequent labeling functions are now given by $\pi^\sharp_a, \pi^\sharp_c : V \to P$. Let us define
$\gamma(G^\sharp) = (V, v_0, R_G, \gamma(\pi^\sharp_a), \gamma(\pi^\sharp_c))$, where $\gamma(\pi^\sharp_a) = \lambda v \in V.\ \gamma(\pi^\sharp_a(v))$ and
$\gamma(\pi^\sharp_c) = \lambda v \in V.\ \gamma(\pi^\sharp_c(v))$ [1]. Note that $\gamma(G^\sharp)$ is a trajectory assertion for $M$.

The typical abstraction used in STE is the one that approximates sets of
boolean vectors with ternary vectors. The possible values of the elements of a
ternary vector are 0, 1 and X, where X denotes the unknown value, i.e., either
0 or 1.
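The ternary abstraction can be sketched in a few lines. In the Python fragment below, boolean vectors and ternary vectors are both written as strings; `alpha`, `gamma` and `join` are hypothetical helper names, and the empty set is deliberately left out because it would need a separate bottom element that plain ternary strings do not provide.

```python
from itertools import product

def alpha(vectors):
    """Abstract a nonempty set of equal-length boolean strings to a ternary string."""
    vectors = list(vectors)
    if not vectors:
        raise ValueError("the empty set needs a separate bottom element")
    return "".join(col[0] if len(set(col)) == 1 else "X"
                   for col in zip(*vectors))

def gamma(ternary):
    """Concretize a ternary string to the set of boolean strings it represents."""
    choices = [("0", "1") if c == "X" else (c,) for c in ternary]
    return {"".join(bits) for bits in product(*choices)}

def join(t1, t2):
    """Least upper bound of two ternary strings, position by position."""
    return "".join(a if a == b else "X" for a, b in zip(t1, t2))

concrete = gamma("1001X") | gamma("0110X")
print(len(concrete))               # 4
print(alpha(concrete))             # XXXXX
print(join("1001X", "0110X"))      # XXXXX
print(len(gamma("XXXXX")))         # 32
```

The union of the two concretizations contains four boolean vectors, but its best ternary description is `XXXXX`, which stands for all 32 vectors. This gap is the source of the imprecision discussed in the rest of the section.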
ASTE as a DFA Problem

In order to formalize ASTE as a DFA problem we only need to compute the fix-point
solution of the recursive equation defined earlier on the abstract domain
$P$ instead of $\wp(S)$. Formally [1] we have $\mathcal{T}^\sharp : (V \to P) \to (V \to P)$:

$$\mathcal{T}^\sharp(\Phi^\sharp)(v) = \begin{cases} \top & \text{if } v = v_0 \\ \bigvee \{F^\sharp(v')(\Phi^\sharp(v')) \mid (v', v) \in R_G\} & \text{otherwise} \end{cases}$$

where $F^\sharp : V \to (P \to P)$ is defined as $F^\sharp(v)(p) = M^\sharp(\pi^\sharp_a(v) \wedge p)$ for each vertex
$v \in V$ and for all abstract predicates $p \in P$. The fix-point equation $\Phi^\sharp = \mathcal{T}^\sharp(\Phi^\sharp)$
has a least solution $\Phi^\sharp_* : V \to P$, which is computed as the limit of the following
sequence $\langle \Phi^\sharp_n : V \to P \mid n \in \mathbb{N} \rangle$, where:

$$\Phi^\sharp_n = \begin{cases} \lambda v \in V.\ \bot & \text{if } n = 0 \\ \mathcal{T}^\sharp(\Phi^\sharp_{n-1}) & \text{otherwise} \end{cases}$$

The abstract circuit $M^\sharp$ satisfies the abstract trajectory assertion $G^\sharp$, denoted by
$M^\sharp \models_A G^\sharp$, if and only if:

$$\forall v \in V : \Phi^\sharp_*(v) \wedge \pi^\sharp_a(v) \leq \pi^\sharp_c(v) \qquad (3)$$

It has been proved that ASTE is preserving but not strong preserving for the
STE algorithm [1]. Namely if the abstract assertion $G^\sharp$ is satisfied by the abstract
model $M^\sharp$, then the concretization of the assertion, $\gamma(G^\sharp)$, is satisfied by the
concrete model $M$, but in general the converse does not hold. This means that
it can happen that $M^\sharp \not\models_A G^\sharp$, while $M \models_C \gamma(G^\sharp)$.

Theorem 3 ([1]). $M^\sharp \models_A G^\sharp \Rightarrow M \models_C \gamma(G^\sharp)$

The following is an example showing that strong preservation in general does
not hold in ASTE.

Example 1. Consider the circuit $M$ in Figure 1 with five signals $x_1, x_2, y_1, y_2, o$,
where $x_1$ and $x_2$ are the input signals, $y_1$ and $y_2$ are the results of, respectively,
the inverter applied to $x_1$ and $x_2$, and $o$ is the output signal computed as the
AND of $y_1$ and $y_2$. Suppose that the abstract circuit approximates the concrete
one by ternary vectors. We want to check if $M^\sharp \models_A G^\sharp$, where $G^\sharp$ is the trajectory
assertion on the right side of Figure 1. In the graphical representation of $G^\sharp$
we only specify the labels different from XXXXX: these labels are $\pi^\sharp_a(v_1) =$
01XXX, $\pi^\sharp_a(v_2) =$ 10XXX and $\pi^\sharp_c(v_4) =$ XXXX0. $G^\sharp$ specifies that when one
of the two inputs is 0 then the output is 0. It is easy to verify that in this case
$M \models_C \gamma(G^\sharp)$ while $M^\sharp \not\models_A G^\sharp$, because of the loss of information implicit in the
ternary abstraction. In fact when computing the abstract fix-point we have that:

$\Phi^\sharp_3(v_3) =$ 1001X $\vee$ 0110X $=$ XXXXX

while in the concrete computation:

$\Phi_3(v_3) =$ {1001X} $\cup$ {0110X} $=$ {1001X, 0110X}

It is clear how the abstract case loses information.

Fig. 1. The circuit and the trajectory assertion G



4 Completeness in STE 

A model satisfies a trajectory assertion when every state sequence of the model
satisfying the preconditions satisfies also the postconditions. The checking algorithm
verifies this by generating a simulation sequence that satisfies the antecedent
(i.e., $\subseteq \Phi_*(v) \cap \pi_a(v)$), and by testing whether the resulting state sequence
satisfies the consequent (i.e., $\subseteq \pi_c(v)$). Notice that if a precondition does
not hold then the property is trivially true (i.e., $\emptyset \subseteq \pi_c(v)$). This suggests
that in order to be precise for this kind of properties it is enough to be precise
for the postconditions. Strong preservation holds if the domain contains all those
sets of states (viz., predicates) that ensure precision for the postconditions along
the computations. A sequence of abstract predicates is precise for a given postcondition
in a vertex $v$ if all the states associated with $v$ and reachable from
any abstract predicate in the sequence leading to $v$ satisfy the postcondition.
This is the idea of what we are going to prove.

We want to investigate the converse of Theorem 3, namely we want to check if
$M \models_C \gamma(G^\sharp) = (V, v_0, R_G, \gamma(\pi^\sharp_a), \gamma(\pi^\sharp_c))$, knowing that $M^\sharp \models_A G^\sharp$, where the function
$\gamma : P \to \wp(S)$ is the concretization map. From abstract interpretation
theory, we know that there is no loss of information in the concretization phase,
namely for each vertex $v \in V$ the set of states represented by $\pi^\sharp_a(v)$ is exactly
the same set given by $\gamma(\pi^\sharp_a(v))$ (the same holds for $\pi^\sharp_c$). Therefore, while checking
$M \models_C \gamma(G^\sharp)$, we have that: $F(v)(p) = post[R](\pi_a(v) \cap p) = post[R](\gamma(\pi^\sharp_a(v)) \cap p)$.
For this reason, in the following, with $\pi^\sharp_a$ and $\pi^\sharp_c$ we indicate also $\gamma(\pi^\sharp_a)$ and $\gamma(\pi^\sharp_c)$
respectively.

Counterexamples in STE 

When $M^\sharp \not\models_A G^\sharp$, it means that there is at least one vertex that does not satisfy
condition (3). Let the failure vertex $v_n$ be a vertex reachable from the initial
vertex $v_0$ which does not satisfy (3), i.e., $\Phi^\sharp_*(v_n) \wedge \pi^\sharp_a(v_n) \not\leq \pi^\sharp_c(v_n)$. This means
that there are some states associated with $\Phi^\sharp_*(v_n)$ that satisfy the precondition
but not the postcondition on $v_n$. The abstract counterexamples are therefore the
sequences of abstract predicates from $\Phi^\sharp_*(v_0) \wedge \pi^\sharp_a(v_0)$ to $(\Phi^\sharp_*(v_n) \wedge \pi^\sharp_a(v_n)) \setminus \pi^\sharp_c(v_n)$
(see Figure 2). These sequences provide a proof of the failure of the trajectory
assertion in the abstract model. The problem now is to ensure that at least one
of these sequences corresponds to a possible behavior of the concrete system.
This happens when there is a concrete counterexample corresponding to the
abstract one. A concrete counterexample is therefore a sequence of states from
$\Phi_*(v_0) \cap \pi_a(v_0)$ to $(\Phi_*(v_n) \cap \pi_a(v_n)) \setminus \pi_c(v_n)$, where the fix-point is computed
on the concrete domain, i.e. it is computed in STE.

Fig. 2. Counterexamples

Definition 1. Assume $M^\sharp \not\models_A G^\sharp$ and let $v_n$ be the failure vertex. An abstract
counterexample for $v_n$ is a sequence of abstract predicates associated with each
vertex $v_i$, for $i = 0 \ldots n$, having the following structure: $B_0 \ldots (B_n \setminus \pi^\sharp_c(v_n))$,
where $B_i = \Phi^\sharp_*(v_i) \wedge \pi^\sharp_a(v_i)$ and $(v_i, v_{i+1}) \in R_G$. A concrete counterexample that
corresponds to the abstract counterexample $B_0 \ldots (B_n \setminus \pi^\sharp_c(v_n))$ is any trajectory
$x_0 \ldots x_n$, where $x_i \in \hat{B}_i$ with $\hat{B}_i = \Phi_*(v_i) \cap \pi_a(v_i)$.

Observe that, by definition, $\hat{B}_i \subseteq B_i$. An abstract counterexample $T$ for a
property $G^\sharp$ is spurious if there is no concrete counterexample corresponding
to it. If there is a concrete trajectory from a state associated to the initial
vertex $v_0$ to a state associated to the failure vertex $v_n$ that does not satisfy
the postcondition, then also the concrete model does not satisfy $\gamma(G^\sharp)$, i.e.,
$M^\sharp \not\models_A G^\sharp \Rightarrow M \not\models_C \gamma(G^\sharp)$. Otherwise we lose strong preservation, since the
states of the abstract predicate associated to $v_n$ that cause the property failure
are not reachable in the concrete model, where the trajectory assertion turns
out to hold. This happens when all the abstract counterexamples are spurious.

Figure 2 shows how abstract counterexamples are typically derived. Assume
that $v_6$ is the failure vertex. The abstract counterexamples are then given by the
following sequences: $B_0, B_1, B_2, B_3, (B_6 \setminus \pi^\sharp_c(v_6))$ and $B_0, B_4, B_5, (B_6 \setminus \pi^\sharp_c(v_6))$.
Strong preservation is guaranteed if at least one of them corresponds to a concrete
computation.

Identification of Spurious Counterexamples 

In this section we characterize the spuriousness of an abstract counterexample.
Let $T = B_0 \ldots (B_n \setminus \pi^\sharp_c(v_n))$ be an abstract counterexample and let $v_n$ be the
corresponding failure vertex. Let us define the following sequence of predicates:

$$X_n = B_n \setminus \pi^\sharp_c(v_n)$$

$$X_{n-i} = pre[R](X_{n-i+1}) \cap B_{n-i}$$

Fig. 3. Counterexample

By construction $X_{n-i} \subseteq B_{n-i}$ is the largest subset of states, associated to $v_{n-i}$,
from which there is a concrete computation to a state in $B_n \setminus \pi^\sharp_c(v_n)$. It is clear
that $X_{n-i} \subseteq \overline{\widetilde{pre}^{\,i}[R](\pi^\sharp_c(v_n))}$, where the overline stands for set complementation.
The idea is that $T$ is not spurious iff all the sets $X_{n-i}$ are not empty, as the
following lemma proves.

Lemma 1. The following are equivalent:

(i) $T = B_0 \ldots (B_n \setminus \pi^\sharp_c(v_n))$ is not spurious;

(ii) for all $0 \leq i \leq n$, $X_{n-i} \neq \emptyset$.

The assumption that all sets $X_{n-i}$ are not empty trivially implies that $X_0$
is not empty. This means that there is a concrete computation from the states
associated to the initial vertex to the ones in $X_n$, and this trajectory is the
concrete counterexample we were looking for. On the other hand, let $X_{n-j}$ be
the first set to be empty in the above sequence. By definition this means that
the states in $X_{n-j+1}$ are not reachable from $B_{n-j}$, namely the states leading to
the failure of the property are introduced by the abstraction.

Example 2. Consider the abstract counterexample $T = B_0, B_1, B_2, (B_3 \setminus \pi^\sharp_c(v_3))$
in Figure 3, where $B_i = \Phi^\sharp_*(v_i) \wedge \pi^\sharp_a(v_i)$ and $x_i \in S$. If the concrete system
includes the transition $(x_2, x_5)$, then $T$ is not spurious. In fact, all the sets
$X_j$ are not empty: $X_3 = \{x_{11}, x_{12}\}$, $X_2 = \{x_8, x_9\}$, $X_1 = \{x_5\}$ and $X_0 =
\{x_2\}$, and there is a concrete trajectory $x_2, x_5, x_8, x_{11}$ that corresponds to this
abstract counterexample. Otherwise, by not considering $(x_2, x_5)$, it is clear that
$T$ becomes spurious. In fact $X_0 = \emptyset$.

From Lemma 1, we derive the algorithm Check-Spurious that, given an abstract
counterexample, checks whether it is spurious or not.

Check-Spurious(M, T)
    $X := B_n \setminus \pi^\sharp_c(v_n)$
    $j := n$
    while ($X \neq \emptyset$ and $j > 0$) do
        $j := j - 1$
        $X := pre[R](X) \cap B_j$
    if ($X \neq \emptyset$)
        then return true
        else return false

The complexity of the algorithm clearly depends on the length of the abstract 
counterexample considered and on the number of the transitions in the concrete 
model. 
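A direct transcription of Check-Spurious for an explicit transition relation might look as follows. This is a sketch under two assumptions that the text does not spell out: `pre` is read as the existential predecessor transformer (states with at least one successor in the given set), and the function is named after what a `True` result means, namely that a concrete counterexample exists. The twelve-state layered system is hypothetical and is only in the spirit of Example 2; it does not reproduce Figure 3.

```python
def pre(R, X):
    """Existential predecessors: states with at least one successor in X."""
    return frozenset(s for (s, t) in R if t in X)

def has_concrete_counterexample(R, B, pi_c_n):
    """Backward pass of Check-Spurious.

    B is the list [B_0, ..., B_n] of state sets of the abstract
    counterexample and pi_c_n is the postcondition at the failure vertex.
    Returns True when X_0 is nonempty, i.e. the counterexample is not spurious.
    """
    X = frozenset(B[-1]) - frozenset(pi_c_n)
    j = len(B) - 1
    while X and j > 0:
        j -= 1
        X = pre(R, X) & frozenset(B[j])
    return bool(X)

# Hypothetical layered system: states 1..12, three states per vertex.
B = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}]
PI_C_N = {10}
R_FULL = {(1, 4), (4, 7), (7, 10), (2, 5), (5, 8), (5, 9), (8, 11), (9, 12)}
R_CUT = R_FULL - {(2, 5)}   # drop the only edge that leads towards the bad states

print(has_concrete_counterexample(R_FULL, B, PI_C_N))  # True
print(has_concrete_counterexample(R_CUT, B, PI_C_N))   # False
```

Each loop iteration performs one predecessor computation, so the running time grows with the length of the counterexample and the size of the transition relation, as noted above.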



Completeness and Counterexamples 

We are interested in forcing strong preservation by refining abstractions. This
means that we have to refine the abstractions so that if there is an abstract
predicate associated to a vertex that does not satisfy the postcondition, then
there is at least one concrete state associated to that vertex that does not satisfy
the postcondition and is reachable from the states associated to the initial vertex.
Recall from [9] that, given two elements $a$ and $b$ of a complete lattice $L$, by
definition $a \to b = \bigvee \{c \in L \mid a \wedge c \leq b\}$. $L$ is relatively pseudo-complemented
(or equivalently a complete Heyting algebra) if for any $a, b \in L$: $a \to b \in L$, such
that $a \wedge c \leq b \Leftrightarrow c \leq a \to b$. It is well known that if $L$ is of the form $\wp(S)$ then
$a \to b = \neg a \cup b$.

Lemma 2. Let $\rho \in uco(\wp(S))$, $c \in \rho$, and $x \in \wp(S)$. If $c \to x \in \rho$ then
$\rho(c \cap x) = \rho(c \cap \rho(x))$.
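Both the powerset form of the relative pseudo-complement and the statement of Lemma 2 can be sanity-checked by exhaustive enumeration on a small universe. The Python sketch below does this for a hypothetical three-element state set and a hypothetical Moore family; it is a finite check on one example, not a proof.

```python
from itertools import combinations

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def moore_closure(family, universe):
    """Close a family of subsets under intersection and add the top element."""
    closed = {frozenset(universe)} | {frozenset(s) for s in family}
    while True:
        new = {a & b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new

def rho(domain, x):
    """Least element of the Moore family `domain` that contains x."""
    return frozenset.intersection(*[m for m in domain if x <= m])

def implies(a, b, universe):
    """Relative pseudo-complement in a powerset: a -> b = (not a) | b."""
    return (frozenset(universe) - a) | b

def heyting_law_holds(universe):
    """Check a & c <= b  iff  c <= (a -> b) for all subsets a, b, c."""
    subsets = powerset(universe)
    return all(((a & c) <= b) == (c <= implies(a, b, universe))
               for a in subsets for b in subsets for c in subsets)

def lemma2_holds(domain, universe):
    """Check rho(c & x) == rho(c & rho(x)) whenever c and c -> x are in the domain."""
    for c in domain:
        for x in powerset(universe):
            if implies(c, x, universe) in domain:
                if rho(domain, c & x) != rho(domain, c & rho(domain, x)):
                    return False
    return True

UNIVERSE = {1, 2, 3}
DOMAIN = moore_closure([{1}, {2, 3}, {1, 2}], UNIVERSE)

print(heyting_law_holds(UNIVERSE))      # True
print(lemma2_holds(DOMAIN, UNIVERSE))   # True
```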

The following lemma gives a way to systematically refine the abstraction in order
to avoid all the spurious counterexamples relative to the failure vertex $v_n$, with
respect to a trajectory assertion $G^\sharp$.

Lemma 3. Given a trajectory assertion $G^\sharp$, if for all $v_i, v_{i+1} \in V$ such that
$(v_i, v_{i+1}) \in R_G$ we have that the element $\pi^\sharp_a(v_{i+1}) \to post[R](B_i) \in \rho$ and
that $\mathbb{S}_{post[R]}(\{\pi^\sharp_c(v_n), \top\}) \subseteq \rho$, then every abstract counterexample of the form
$B_0 \ldots (B_n \setminus \pi^\sharp_c(v_n))$ is not spurious.

Example 3. This example refers to Example 2 and shows that, when strong
preservation does not hold, namely when we do not consider the transition
$(x_2, x_5)$ in Figure 3, then we contradict the hypothesis of Lemma 3. We have:
$X_3 = \{x_{11}, x_{12}\}$, $X_2 = \{x_8, x_9\}$, $X_1 = \{x_5\}$, and $X_0 = \emptyset$. Observe that
$\widetilde{pre}[R](\pi^\sharp_c(v_3)) = \{x_7\}$, $\widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3)) = \{x_4\}$, $\widetilde{pre}^{\,3}[R](\pi^\sharp_c(v_3)) = \{x_1\}$. By
definition, computing $B_1$ we have:

$B_1 = \Phi^\sharp_*(v_1) \wedge \pi^\sharp_a(v_1)$
    $= \rho(post[R](\Phi^\sharp_*(v_0) \wedge \pi^\sharp_a(v_0))) \cap \pi^\sharp_a(v_1)$
    $= \rho(post[R](B_0)) \cap \pi^\sharp_a(v_1)$
    $= \{x_4, x_5, x_6\}$

Moreover note that $post[R](B_0) \cap \pi^\sharp_a(v_1) = \{x_4\}$ and $\{x_4\} \subseteq \widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3))$. The
abstraction $\rho$ is monotone, therefore: $\rho(post[R](B_0) \cap \pi^\sharp_a(v_1)) \subseteq \rho(\widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3)))$.
The fact that $B_1 = \rho(post[R](B_0)) \cap \pi^\sharp_a(v_1) \not\subseteq \widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3))$ contradicts the
hypothesis $\pi^\sharp_a(v_1) \to post[R](B_0) \in \rho$ and $\widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3)) \in \rho$. This is because,
from Lemma 2, when $\pi^\sharp_a(v_1) \to post[R](B_0) \in \rho$ then $\rho(post[R](B_0) \cap \pi^\sharp_a(v_1)) =
\rho(post[R](B_0)) \cap \pi^\sharp_a(v_1)$, therefore $\rho(post[R](B_0)) \cap \pi^\sharp_a(v_1) \subseteq \rho(\widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3)))$.
This means that $\rho(post[R](B_0)) \cap \pi^\sharp_a(v_1) \subseteq \widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3))$, since by hypothesis
$\widetilde{pre}^{\,2}[R](\pi^\sharp_c(v_3)) \in \rho$. So applying Lemma 3 we have: $B_1 = \{x_4\}$, $B_2 = \{x_7\}$, and
$B_3 = \{x_{10}\}$. In this situation $M^\sharp \models_A G^\sharp$.

The algorithm Refine is derived from Lemma 3. This algorithm, given a spurious
abstract counterexample $T$, refines the abstraction in order to make all
the counterexamples leading to the final vertex $v_n$ of $T$ not spurious.

Refine(T, M, G, $\rho$)
    $S := \pi^\sharp_c(v_n)$
    for $j = n$ downto 1 do
        $\rho := \mathcal{M}(\rho \cup \widetilde{pre}[R](S))$
        $S := \widetilde{pre}[R](S)$
    for each $(v_i, v_{i+1}) \in R_G$ do
        $\rho := \mathcal{M}(\rho \cup (\pi^\sharp_a(v_{i+1}) \to post[R](B_i)))$

Thanks to Lemma 3 we have that by adding to the abstract domain all the
elements in $\mathbb{S}_{post[R]}(\{\pi^\sharp_c(v), \top\})$ and $\pi^\sharp_a(v_{i+1}) \to post[R](B_i)$ for all the vertexes,
we gain strong preservation for all possible counterexamples for $G^\sharp$.

Theorem 4. If $\{\mathbb{S}_{post[R]}(\{\pi^\sharp_c(v), \top\}) \mid v \in V\} \subseteq \rho$ and for each $(v_i, v_{i+1}) \in R_G$
we have that $\pi^\sharp_a(v_{i+1}) \to post[R](B_i) \in \rho$, then:

$$M \models_C \gamma(G^\sharp) \Rightarrow M^\sharp \models_A G^\sharp$$

In other words, by adding to the abstract domain all the elements $\widetilde{pre}^{\,j}[R](\pi^\sharp_c(v))$
for each $v \in V$, where the index $j$ stands for the iteration of the operator $\widetilde{pre}[R]$,
together with all the elements $\neg \pi^\sharp_a(v_{i+1}) \cup post[R](B_i)$, where $(v_i, v_{i+1}) \in R_G$,
we gain strong preservation with respect to the property $G^\sharp$. This corresponds to
refining the abstraction with respect to a restricted form of Heyting completion
(see [9]) and complete shell with respect to $post[R]$ (see [8]).

From Theorem 4 we can derive the algorithm PropertyCompl, which refines 
the abstract domain with respect to a given property, making ASTE complete 
with respect to G. 

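PropertyCompl(M, G, $\rho$)
    for $i = 1$ to $|V|$ do
        $S := \pi^\sharp_c(v_i)$
        while ($\widetilde{pre}[R](S) \neq \emptyset$)
            $\rho := \mathcal{M}(\rho \cup \widetilde{pre}[R](S))$
            $S := \widetilde{pre}[R](S)$
        for $j = 1$ to $|V|$ do
            if $(v_i, v_j) \in R_G$
                then $\rho := \mathcal{M}(\rho \cup (\pi^\sharp_a(v_j) \to post[R](B_i)))$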
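To make the refinement concrete, the following Python sketch runs a PropertyCompl-style loop on an explicit three-state system and compares the abstract check before and after refinement. It is a simplified illustration, not the authors' implementation. It departs from the pseudocode in three labelled ways: abstract predicates are represented directly as the sets of states they concretize to (so the abstract domain is a Moore family of subsets); the `while` loop carries an extra guard that stops when an iterate repeats, since otherwise it need not terminate; and the sets $B_i$ are computed once from the abstract fix-point obtained after the first phase. All names and the example system are hypothetical.

```python
def post(R, X):
    return frozenset(t for (s, t) in R if s in X)

def pre_tilde(R, X, states):
    return frozenset(s for s in states
                     if all(t in X for (u, t) in R if u == s))

def moore_closure(family, states):
    closed = {frozenset(states)} | {frozenset(m) for m in family}
    while True:
        new = {a & b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new

def rho(domain, x):
    """Least element of the Moore family `domain` containing x (top is always present)."""
    return frozenset.intersection(*[m for m in domain if x <= m])

def abstract_fixpoint(domain, states, R, V, v0, RG, pi_a):
    """Least solution of the abstract data flow equation over `domain`."""
    phi = {v: rho(domain, frozenset()) for v in V}
    while True:
        nxt = {}
        for v in V:
            if v == v0:
                nxt[v] = frozenset(states)
            else:
                acc = frozenset()
                for (u, w) in RG:
                    if w == v:
                        acc |= rho(domain, post(R, pi_a[u] & phi[u]))
                nxt[v] = rho(domain, acc)   # join in the abstract domain
        if nxt == phi:
            return phi
        phi = nxt

def aste_holds(domain, states, R, V, v0, RG, pi_a, pi_c):
    phi = abstract_fixpoint(domain, states, R, V, v0, RG, pi_a)
    return all(phi[v] & pi_a[v] <= pi_c[v] for v in V)

def property_compl(domain, states, R, V, v0, RG, pi_a, pi_c):
    top = frozenset(states)
    domain = moore_closure(domain, states)
    for v in V:                       # phase 1: iterated pre~ of each consequent
        S, seen = pi_c[v], set()
        while True:
            S = pre_tilde(R, S, states)
            if not S or S in seen:    # `S in seen` is an added termination guard
                break
            seen.add(S)
            domain = moore_closure(domain | {S}, states)
    phi = abstract_fixpoint(domain, states, R, V, v0, RG, pi_a)
    for (u, w) in RG:                 # phase 2: pi_a(w) -> post[R](B_u)
        B = phi[u] & pi_a[u]
        domain = moore_closure(domain | {(top - pi_a[w]) | post(R, B)}, states)
    return domain

# Hypothetical system a -> b -> c and assertion "from a, two steps later we are in c".
STATES = frozenset("abc")
R = {("a", "b"), ("b", "c")}
V = ["v0", "v1", "v2"]
RG = [("v0", "v1"), ("v1", "v2")]
PI_A = {"v0": frozenset("a"), "v1": STATES, "v2": STATES}
PI_C = {"v0": STATES, "v1": STATES, "v2": frozenset("c")}

coarse = moore_closure([frozenset("a"), frozenset("c")], STATES)
refined = property_compl(coarse, STATES, R, V, "v0", RG, PI_A, PI_C)
print(aste_holds(coarse, STATES, R, V, "v0", RG, PI_A, PI_C))   # False
print(aste_holds(refined, STATES, R, V, "v0", RG, PI_A, PI_C))  # True
```

In the coarse domain the state `b` can only be approximated by the top element, so the abstract check fails; after refinement the domain contains `{b, c}` and `{b}`, and the check succeeds.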
In the following example we consider the circuit of Example 1, and apply Theorem 4
to the underlying abstract domain in order to reach strong preservation.

Example 4. Consider the circuit and the trajectory assertion of Example 1. By
computing the abstract fix-point solution we obtain at the fourth iteration
step that the fix-point is reached and for all $v \in V$, $\Phi^\sharp_*(v) =$ XXXXX. It
results that $\Phi^\sharp_*(v_4) \wedge \pi^\sharp_a(v_4) \not\leq \pi^\sharp_c(v_4)$, therefore, by definition, $M^\sharp \not\models_A G^\sharp$. In this
case we have two abstract counterexamples, $T_1 = B_0, B_1, B_3, (B_4 \setminus \pi^\sharp_c(v_4))$ and
$T_2 = B_0, B_2, B_3, (B_4 \setminus \pi^\sharp_c(v_4))$. Also the solution to the concrete data flow equation
is reached at the fourth iteration step, where $\Phi_4(v_0) = \Phi_4(v_1) = \Phi_4(v_2) =$
XXXXX, while $\Phi_4(v_3) =$ {1001X, 0110X} and $\Phi_4(v_4) =$ {10010, 01100}. In
this case $\Phi_*(v) \cap \pi_a(v) \subseteq \pi_c(v)$ holds for each vertex $v \in V$, and in particular
$\Phi_*(v_4) \cap \pi_a(v_4) \subseteq \pi_c(v_4)$, therefore $M \models_C \gamma(G^\sharp)$. The problem here is that $T_1$
and $T_2$ are both spurious, in fact the states in $B_3 = \Phi^\sharp_*(v_3) \wedge \pi^\sharp_a(v_3)$ that lead
to a state that does not satisfy $\pi^\sharp_c(v_4)$ are not reachable in the concrete system,
but they are introduced by the abstraction when computing $\Phi^\sharp_*(v_3)$. Here
$\widetilde{pre}[R](\pi^\sharp_c(v_4)) =$ {0000X, 0110X, 1001X} is not an element of the abstract domain,
in fact the ternary vector that abstracts this set of states is XXXXX. We
can observe that by adding this element to the abstract domain we gain strong
preservation.

$\Phi^\sharp_3(v_3) = \rho(\rho(post[R](\pi^\sharp_a(v_1) \wedge \Phi^\sharp_2(v_1))) \vee \rho(post[R](\pi^\sharp_a(v_2) \wedge \Phi^\sharp_2(v_2))))$
    $= \rho(\rho(post[R]($10XXX $\wedge$ XXXXX$)) \vee \rho(post[R]($01XXX $\wedge$ XXXXX$)))$
    $= \rho($1001X $\vee$ 0110X$) =$ {1001X, 0110X, 0000X}

$\Phi^\sharp_4(v_4) = \rho(post[R](\pi^\sharp_a(v_3) \wedge \Phi^\sharp_3(v_3)))$
    $= \rho(post[R]($XXXXX $\wedge$ {1001X, 0110X, 0000X}$))$
    $= \rho($\{10010, 01100, 00000\}$) =$ XXXX0

Now we have $\Phi^\sharp_*(v_4) \wedge \pi^\sharp_a(v_4) =$ XXXX0 $= \pi^\sharp_c(v_4)$, and therefore $M^\sharp \models_A G^\sharp$.

The following example shows that the converse of Theorem 4 is not true. In
particular we show that, even though all the elements $\pi_a(v_{i+1}) \rightarrow post[R](B_i)$
are included in the domain, the lack of completeness for $post[R]$ does not affect
the strong preservation for the given property.

Example 5. Let us consider the same circuit used earlier in Example 1 but a
different trajectory assertion $G$, where the set of vertexes $V = \{v_0, v_1, v_2, v_3, v_4\}$
and the transition relation $R_G = \{(v_0, v_1), (v_0, v_2), (v_1, v_3), (v_2, v_3), (v_3, v_4)\}$
are the same as in the previous example, while the labels different from $XXXXX$
are: $\pi_a(v_1) = 10XXX$, $\pi_a(v_2) = 00XXX$ and $\pi_c(v_4) = XXXX0$. It is easy
to verify that here $M \models_c G$ and $M \models_a \gamma(G)$. At the fourth iteration step,
the abstract fix-point is reached and $\Phi^{\sharp}_4(v_0) = \Phi^{\sharp}_4(v_1) = \Phi^{\sharp}_4(v_2) = XXXXX$,
while $\Phi^{\sharp}_4(v_3) = X00XX$ and $\Phi^{\sharp}_4(v_4) = X00X0$. For all $v \in V$ the satisfaction
condition holds: $\Phi^{\sharp}_*(v) \wedge \pi_a(v) \leq \pi_c(v)$, i.e., $M \models_a G$. In this case we have
strong preservation even though $pre[R](\pi_c(v_4))$ is not an element of the abstract
domain.

5 Related Works and Conclusions 

In this paper we have studied the impact of the standard notion of complete- 
ness in abstract interpretation based STE. It turns out that through domain 
refinements it is possible to achieve strong preservation for ASTE. In particular 
Lemma 3 gives a systematic way for refining the abstract domain in order to 
achieve strong preservation with respect to a particular failure vertex Vn of a 
trajectory assertion G. The idea here is to refine the abstract domain when it is 
necessary for the property of interest. When the STE algorithm returns a nega- 
tive answer, we check if all the corresponding counterexamples are spurious, and 
only in this case we refine. An alternative solution is the one proposed by Theo- 
rem 4, where a larger amount of information is added to the domain at once, in 
order to make it complete for every possible abstract counterexample for G. The 
converse of Theorem 4 does not hold in general. In fact the proposed conditions
for strong preservation in ASTE with respect to a property $G$ are too strong:
ASTE can be complete for a trajectory assertion $G$ even when the hypotheses
of Theorem 4 do not hold, as shown in Example 5. As future work it would
be interesting to verify whether the restricted completeness for $post[R]$ together with
the Heyting completion are necessary conditions for strong preservation with
respect to every possible property expressed as a trajectory assertion. This would
agree with the observation in [13], where the authors proved that an abstraction
is strong preserving relative to a given temporal logic fragment $L$ of the
$\mu$-calculus if and only if the underlying abstract domain is complete with respect
to the adjoint of the operators in $L$. This means that completeness refinement
minimally transforms the abstract domain in order to make it strong preserving
for a given logic. In this case we could gain an in-depth comprehension of the
structure of the logic underlying STE. In fact, in view of [13], it seems that the
logic of STE, with basic operations given by conjunction, next, and
domain restriction, is strictly weaker than the $\mu$-calculus.

A result analogous to the one obtained here for ASTE with respect to trajectory
assertions is the one obtained in [7] for abstract model checking (AMC)
with respect to the fragment $\forall$CTL* of the branching time temporal logic CTL*.
It is well known that strong preservation in AMC is achieved when there are no
spurious counterexamples [2,6]. In [7] the authors studied the relation between
standard completeness and strong preservation in AMC, and it turns out that
the latter is guaranteed by making the underlying abstract domain complete
with respect to $post[R]$.

In this paper we have considered STE and not its generalized version GSTE, 
which is known to handle all $\omega$-regular properties. It is reasonable to guess that
strong preservation for GSTE, with respect to $\omega$-regular properties, is related
to the standard notion of completeness of abstract interpretation. In order to 
better understand the relation between GSTE and STE, it would be interesting 
to study the conditions making their abstractions strong preserving. 
Abstract. Linear-relations analysis of transition systems discovers linear
invariant relationships among the variables of the system. These relationships
help establish important safety and liveness properties. Efficient techniques
for the analysis of systems using polyhedra have been explored, leading to the
development of successful tools like HyTech. However, existing techniques rely
on the use of approximations such as widening and extrapolation in order to
ensure termination. In an earlier paper, we demonstrated the use of Farkas'
Lemma to provide a translation from the linear-relations analysis problem into
a system of constraints on the unknown coefficients of a candidate invariant.
However, since the constraints in question are non-linear, a naive application
of the method does not scale. In this paper, we show that by some efficient
simplifications and approximations to the quantifier elimination procedure, not
only does the method scale to higher dimensions, but it also enjoys performance
advantages for some larger examples.



1 Introduction 

Linear-relations analysis discovers linear relationships among the variables of a 
program, that hold in all the reachable program states. Such relationships are 
called linear invariants. Invariants are useful in the verification of both safety 
and liveness properties. Many existing techniques rely on the presence of these 
invariants to prove properties of interest. Some types of analysis, e.g.,
variable-bounds analysis, can be viewed as specializations of linear-relations analysis.
Traditionally, this analysis is framed as an abstract interpretation in the do- 
main of polyhedra [7,8]. The analysis is carried out using a propagation-based 
technique, wherein polyhedral iterates that converge towards the final result, 
are computed. This convergence is ensured through the use of widening, or ex- 
trapolation, operators. Such techniques are popular in the domains of discrete 
and hybrid programs, motivating tools like HyTech [12] and improved widening 
operators over polyhedra [11, 1]. 

* This research was supported in part by NSF grants CCR-01-21403, CCR-02-20134 
and CCR-02-09237, by ARO grant DAAD19-01-1-0723, by ARPA/AF contracts
F33615-00-C-1693 and F33615-99-C-3014, and by NAVY/ONR contract
N00014-03-1-0939.



R. Giacobazzi (Ed.): SAS 2004, LNCS 3148, pp. 53-68, 2004.
© Springer-Verlag Berlin Heidelberg 2004







Sriram Sankaranarayanan, Henny B. Sipma, and Zohar Manna 



Alternatively, the fixpoint equations arising from abstract interpretation may 
be posed explicitly, and solved without relying directly on iteration or widening. 
This is achieved through applications of Farkas Lemma in our earlier work [6]. 
Given a template inequality with unknown coefficients, our technique computes 
constraints on the values of the coefficients, such that substituting any solu- 
tion back into the template yields a valid invariant relationship. However, the 
constraints themselves are non-linear with existentially quantified parameters. 
Nevertheless, an exact elimination is possible in theory through quantifier elim- 
ination techniques for the theory of reals [16,5,17]. In practice, however, the 
technique using exact quantifier elimination does not scale to systems with more 
than five variables. 

Fortunately, the constraints obtained in this process, though non-linear, ex- 
hibit many structural properties that can be exploited to simplify and solve 
them. In many cases, a series of simplifications resolves the constraints into a 
linear system. For instance, whenever the underlying transition system is a Petri 
net, the system of constraints resolves into a linear system [14]. This has led us to 
verify transition systems derived from Petri Nets with as many as 40 dimensions 
and 50 transitions. The use of quantifier elimination is clearly inefficient in such 
situations. In this paper, we provide a set of exact and heuristic rules for sim- 
plifying and solving the constraints for general linear transition systems. Most 
of our rules are exact, but their application may not resolve the constraints. 
Therefore, some heuristics are used instead of an exact elimination as a last 
resort. 

At lower dimensions, our technique performs poorly in terms of time and 
space, relative to the propagation-based approach. When the dimension is in- 
creased, our technique not only scales but in some cases, outperforms the propa- 
gation-based techniques. Furthermore, our technique enjoys several advantages 
over related approaches that are very useful for analyzing larger systems, as 
presented in Section 4. The remainder of this paper consists of Section 2 on pre- 
liminaries, Section 3 on the constraint structure and solving rules, and Section 4 
on some experimental results. 



2 Preliminaries 

We recount some standard results on polyhedra, and then define linear transition 
systems, followed by a description of propagation-based analysis techniques. We 
then demonstrate an alternative approach called constraint-based analysis.



2.1 Linear Assertions 

Throughout this discussion, let $\{x_1, \ldots, x_n\}$ be a set of real-valued variables.
Constant reals are denoted by $a, b$ with subscripts, and unknown coefficients by $c, d$
with subscripts. Further details about linear assertions can be obtained from 
standard texts [15]. 
Definition 1 (Linear Assertions) A linear expression is of the form
$a_1 x_1 + \cdots + a_n x_n + b$. The expression is homogeneous iff $b = 0$, or else it is
inhomogeneous. A linear inequality is of the form $\alpha \bowtie 0$, where $\bowtie \in \{\geq, =, >\}$. The
inequality is strict if $\bowtie \in \{>\}$. A linear assertion is a finite conjunction of linear
inequalities. Linear assertions can be homogeneous or otherwise, depending on
the underlying linear expressions. The set of points in $\mathbb{R}^n$ satisfying a linear
assertion (homogeneous assertion) is called a polyhedron (polyhedral cone).

We shall assume that linear assertions do not contain any strict inequalities. It is 
well-known that any polyhedron is representable by a set of constraints (as a lin- 
ear assertion), or by its vertices, and rays (infinite directions), collectively called 
its generators. The problems of computing the generators, given the assertion,
and vice versa, have been well studied, and efficient algorithms exist [9]. However, the
number of generators of a polyhedron can be worst-case exponential in the num- 
ber of constraints (the n-dimensional hypercube is an example) . Basic operations 
on these assertions are computed thus: 

Intersection: Combine the inequalities in both the polyhedra.

Convex Union: Combine the generators of the two polyhedra.

Projection: Project the generators of the polyhedron.

Containment: Test every generator of $\varphi_1$ for subsumption by $\varphi_2$.

Emptiness: A polyhedron is empty iff it has no generators.
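
As a rough illustration of the containment test, the sketch below checks whether every generator of one polyhedron satisfies the constraints of another. The representation (pairs `(a, b)` read as $a \cdot x + b \geq 0$, vertices and rays as tuples) and the function name `contains` are assumptions made for this example; they are not taken from the paper or from any polyhedra library.

```python
from fractions import Fraction

def contains(constraints, vertices, rays=()):
    """Return True if the polyhedron generated by `vertices` and `rays`
    lies inside {x | a.x + b >= 0 for each (a, b) in constraints}."""
    def dot(a, x):
        return sum(Fraction(ai) * Fraction(xi) for ai, xi in zip(a, x))
    for a, b in constraints:
        # every vertex must satisfy the inequality itself
        if any(dot(a, v) + b < 0 for v in vertices):
            return False
        # every ray must satisfy the homogeneous part of the inequality
        if any(dot(a, r) < 0 for r in rays):
            return False
    return True

# phi2: x >= 0 and y >= 0 (hypothetical example data)
quadrant = [((1, 0), 0), ((0, 1), 0)]
# phi1: triangle with vertices (0,0), (0,1), (1,2)
triangle = [(0, 0), (0, 1), (1, 2)]
print(contains(quadrant, triangle))             # True
print(contains(quadrant, [(0, 0)], [(1, -1)]))  # False
```

The test is linear in the number of generators times the number of constraints, which is why the generator representation matters for containment.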

We now state Farkas Lemma, which describes the linear consequences of a linear 
assertion. A proof is available from the standard references [15]. 

Theorem 1 (Farkas' Lemma). Consider a linear assertion $S$ over real-valued
variables $x_1, \ldots, x_n$,

\[
S: \left[\begin{array}{c}
a_{11} x_1 + \cdots + a_{1n} x_n + b_1 \geq 0 \\
\vdots \\
a_{m1} x_1 + \cdots + a_{mn} x_n + b_m \geq 0
\end{array}\right]
\]

When $S$ is satisfiable, it implies a given linear inequality
$\psi: a_1 x_1 + \cdots + a_n x_n + b \geq 0$, i.e., $S \models \psi$, if and only if there exist
non-negative real numbers $\lambda_0, \lambda_1, \ldots, \lambda_m$ such that
$a_j = \sum_{i=1}^{m} \lambda_i a_{ij}$, $j \in [1, n]$, and $b = (\sum_{i=1}^{m} \lambda_i b_i) + \lambda_0$.

Furthermore, $S$ is unsatisfiable if and only if the inequality $-1 \geq 0$ can be derived
as shown above.

In the rest of the paper we represent applications of this lemma by a table as 
shown below: 



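\[
\begin{array}{c|rcll}
\lambda_0 & 1 & \geq & 0 & \\
\lambda_1 & a_{11} x_1 + \cdots + a_{1n} x_n + b_1 & \geq & 0 & \\
\vdots & \vdots & & & \quad \Big\}\ S \\
\lambda_m & a_{m1} x_1 + \cdots + a_{mn} x_n + b_m & \geq & 0 & \\
\hline
 & c_1 x_1 + \cdots + c_n x_n + d & \geq & 0 & \leftarrow \psi, \text{ or} \\
 & -1 & \geq & 0 & \leftarrow \text{false}
\end{array}
\]
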
The table shows the antecedents above the line and the consequences below. For
each column, the sum of the column entries above the line, with appropriate 
multipliers, must be equal to the entry below the line. If a row corresponds to an 
equality rather than an inequality, we drop the requirement that the multiplier 
corresponding to it be non-negative. 
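
To make the column-sum reading of the table concrete, here is a minimal sketch that forms the consequence from a set of rows and multipliers. The function name `farkas_combination` and the row encoding `(coeffs, const)` for `coeffs.x + const >= 0` are assumptions of this example. It handles inequality rows only; equality rows, whose multipliers may be negative, are not modelled.

```python
from fractions import Fraction

def farkas_combination(rows, multipliers, lam0=0):
    """Combine rows (coeffs, const), each read as coeffs.x + const >= 0,
    using non-negative multipliers; lam0 is the multiplier of 1 >= 0."""
    if lam0 < 0 or any(m < 0 for m in multipliers):
        raise ValueError("multipliers must be non-negative")
    n = len(rows[0][0])
    # each coefficient below the line is the weighted sum of its column
    coeffs = [sum(Fraction(m) * a[j] for m, (a, _) in zip(multipliers, rows))
              for j in range(n)]
    const = sum(Fraction(m) * b for m, (_, b) in zip(multipliers, rows)) + lam0
    return coeffs, const

# S: x >= 0 and y - x >= 0; adding both rows yields y >= 0
S = [((1, 0), 0), ((-1, 1), 0)]
coeffs, const = farkas_combination(S, [1, 1])
print(coeffs == [0, 1], const == 0)

# x - 1 >= 0 and -x >= 0 is unsatisfiable: the rows sum to -1 >= 0
print(farkas_combination([((1,), -1), ((-1,), 0)], [1, 1]) == ([0], -1))
```

Both comparisons print `True`; the second is the "false" row of the table, derived from an unsatisfiable assertion.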

2.2 Transition Systems and Invariants 

In this section, we define linear transition systems and linear invariants. Our pre- 
sentation concentrates only on linear systems. The reader is referred to standard 
textbooks for a more general presentation [13]. 

Definition 2 (Linear Transition Systems) Let $V = \{x_1, \ldots, x_n\}$ be a set of
system variables. A linear transition system over $V$ is a tuple $\langle L, \mathcal{T}, \ell_0, \Theta \rangle$, where
$L$ is a set of locations, $\mathcal{T}$ is a set of transitions, each transition $\tau \in \mathcal{T}$ is a tuple
$\langle \ell_i, \ell_j, \rho_\tau \rangle$, such that $\ell_i, \ell_j \in L$ are the pre- and the post-locations, respectively,
and $\rho_\tau$ is a linear assertion over $V \cup V'$, where $V$ denotes the current-state
variables, and $V'$ the next-state variables. Location $\ell_0 \in L$ is the initial location,
and $\Theta$ is a linear assertion over $V$ specifying the initial condition.

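Example 1. Let $V = \{x, y\}$ and $L = \{\ell_0\}$. Consider the transition system shown
below. Each transition models a concurrent process that updates the variables
$x, y$ atomically.

\[
\begin{array}{rcl}
\Theta & = & (x = 0 \wedge y = 0) \\
\mathcal{T} & = & \{\tau_1, \tau_2\} \\
\tau_1 & = & \langle \ell_0, \ell_0, [x' = x + 2y \wedge y' = 1 - y] \rangle \\
\tau_2 & = & \langle \ell_0, \ell_0, [x' = x + 1 \wedge y' = y + 2] \rangle
\end{array}
\]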

A given linear assertion $\psi$ is a linear invariant of a linear transition system
(LTS) at a location $\ell$ iff it is satisfied by every state reaching $\ell$. An assertion
map maps each location of an LTS to a linear assertion. An assertion map $\eta$ is
an invariant map if $\eta(\ell)$ is an invariant at $\ell$, for each $\ell \in L$. In order to prove a
given assertion map invariant, we use the theory of inductive assertions due to
Floyd and Hoare [13].

Definition 3 (Inductive Assertion Maps) An assertion map $\eta$ is inductive
iff it satisfies the following conditions:

Initiation: $\Theta \models \eta(\ell_0)$.

Consecution: For each transition $\tau: \langle \ell_i, \ell_j, \rho_\tau \rangle$, $\eta(\ell_i) \wedge \rho_\tau \models \eta(\ell_j)'$.

It can be shown by mathematical induction that any inductive assertion map is 
also an invariant map. It is well known that the converse need not be true in 
general. The standard technique for proving an assertion invariant is to find an 
inductive assertion that strengthens it. For example, the assertion $x + y \geq 0$ is
an invariant for the LTS in Example 1.
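
The claim can be sanity-checked by brute force. The sketch below explores the states of the LTS of Example 1 up to a fixed depth and tests $x + y \geq 0$ on each of them. The helper names `successors` and `reachable` and the depth bound are choices made for this illustration; a bounded exploration is only a check, not a proof of invariance.

```python
def successors(state):
    """States reachable in one step of the LTS of Example 1."""
    x, y = state
    return [(x + 2 * y, 1 - y),   # tau_1: x' = x + 2y, y' = 1 - y
            (x + 1, y + 2)]       # tau_2: x' = x + 1,  y' = y + 2

def reachable(depth):
    """All states reachable from (0, 0) in at most `depth` steps."""
    seen = {(0, 0)}
    frontier = {(0, 0)}
    for _ in range(depth):
        frontier = {t for s in frontier for t in successors(s)} - seen
        seen |= frontier
    return seen

states = reachable(10)
print(all(x + y >= 0 for x, y in states))  # True
```

The check succeeds at every depth because each transition increases $x + y$: by 1 under the first transition and by 3 under the second.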
Iteration #   $\eta(\ell_0)$                                                        Iteration Type

0             $x = 0,\ y = 0$                                                       Init
1             $y - 2x \geq 0,\ x - y + 1 \geq 0,\ x \geq 0$                         Propagation
2             $5x + 3y \leq 22,\ x - y + 2 \geq 0,\ x \geq 0,$
              $x + 5y \geq 0,\ 2x - y + 1 \geq 0$                                   Propagation
3             $x \geq 0,\ 2x - y + 1 \geq 0$                                        Widening
4             true                                                                  Widening

Fig. 1. Sequence of Propagation and Widening Steps for LTS in Example 1.



Propagation-Based Analysis. These techniques are based on the abstract-inter- 
pretation framework formalized by Cousot and Cousot [7], and specialized for 
linear relations by Cousot and Halbwachs [8]. The technique starts from an 
initial assertion map, and weakens it iteratively using the Post and the Widening
operators. When the iteration converges, the resulting map is guaranteed to be
inductive, and hence invariant. Termination is guaranteed by the design of the
widening operator. Often widening is not used, or is replaced by an Extrapolation
operator, and the termination guarantee is traded off against accuracy.



Definition 4 (Post-condition and Widening Operators) The post-condition
operator takes an assertion $\varphi$, and a transition relation $\rho_\tau$.

\[
post(\varphi, \tau) = (\exists V_0)\,(\varphi(V_0) \wedge \rho_\tau(V_0, V))
\]

Intersection, followed by quantifier elimination using projection, computes post.
However, more efficient strategies for computing post exist when $\rho_\tau$ has a special
structure.

Given assertions $\varphi_{\{1,2\}}$ such that $\varphi_1 \models \varphi_2$, the standard widening $\varphi_1 \nabla \varphi_2$
is an assertion $\varphi$ that contains (roughly) all the inequalities in $\varphi_1$ that are
satisfied by $\varphi_2$. The details along with key mathematical properties of widening
are described in [8,7], and enhanced versions appear in [11,4,1].
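
The "roughly" in this description can be sketched as follows, under strong simplifying assumptions: the first assertion is given as a list of inequalities `(a, b)` read as $a \cdot x + b \geq 0$, the second by its generators, and an inequality is kept when every generator satisfies it. The function name `widen` is hypothetical, and the sketch ignores the refinements of the actual operator in [8] (for instance, its treatment of different constraint representations of the same polyhedron).

```python
def widen(phi1, phi2_vertices, phi2_rays=()):
    """Keep the inequalities (a, b) of phi1, read as a.x + b >= 0,
    that every generator of phi2 satisfies."""
    def dot(a, x):
        return sum(ai * xi for ai, xi in zip(a, x))
    kept = []
    for a, b in phi1:
        ok_vertices = all(dot(a, v) + b >= 0 for v in phi2_vertices)
        ok_rays = all(dot(a, r) >= 0 for r in phi2_rays)
        if ok_vertices and ok_rays:
            kept.append((a, b))
    return kept

# phi1: 0 <= x <= 1; phi2: the segment from 0 to 2 (hypothetical data)
phi1 = [((1,), 0), ((-1,), 1)]
print(widen(phi1, [(0,), (2,)]))  # [((1,), 0)]
```

In the usage above the unstable upper bound $x \leq 1$ is dropped and only $x \geq 0$ survives, which is the behaviour that forces convergence.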

As mentioned earlier, the analysis begins with an initial assertion map defined
by $\eta_0(\ell_0) = \Theta$, and $\eta_0(\ell) = \emptyset$ for $\ell \neq \ell_0$. At each step, the map $\eta_i$ is updated to
map $\eta_{i+1}$ as follows:

\[
\eta_{i+1}(\ell) = \eta_i(\ell)\ \mathrm{OP} \bigsqcup_{\tau_j = \langle \ell_j, \ell, \rho \rangle} post(\eta_i(\ell_j), \tau_j)
\]

where OP is the convex hull ($\sqcup$) operator for a propagation step, and the widening
($\nabla$) operator for a widening step. The overall algorithm requires a predefined
iteration strategy. A typical strategy carries out a predefined sequence of initial 
propagation steps, followed by widening steps until termination. The choice of a 
strategy is of the utmost importance for minimizing the number of propagation 
and widening steps, in general. 

The method described above was applied to the LTS in Example 1. Using 
the standard widening [8], we obtain the sequence of iterates shown in Figure 1. 
The result does not change even when the number of initial propagation steps $k$
is increased to 4. Using the widening operator of Bagnara et al., implemented in
the PPL library [1], with $k = 4$ does not change the result either, even if the number of
propagation steps is increased further. Surprisingly, when the number of initial
propagation steps is reduced to $k = 3$, it yields the invariant $11x + 10y \geq 0$, $11x + 12y \geq 0$,
and for $k = 2$, the invariant $7x + 6y \geq 0$, $7x + 8y \geq 0$. The values $k = 0, 1$ also produce the
trivial invariant (true). This demonstrates that an increase in the number of
initial propagation steps does not necessarily increase the accuracy of the result.

Constraint-Based Analysis. The framework of abstract interpretation [7] shows 
that any semantic analysis can be expressed as a fixpoint equation in an ab- 
stract domain. Consequently, linear-relations analysis is a fixed point compu- 
tation in the domain of polyhedra. This computation is done by iteration in 
the propagation-based analysis of Cousot and Halbwachs [8]. We propose to 
use Farkas' Lemma to generate constraints from the LTS description, directly
describing the relevant fixed-point. The resulting constraints are solved using 
non-linear quantifier elimination. 

Let $C$ be a set of template variables. A template is an inequality of the form
$c_1 x_1 + \cdots + c_n x_n + d \geq 0$, where $c_1, \ldots, c_n, d \in C$. A template map associates each
location with a template. We shall use $\eta(\ell)$ to denote both the inequality and
the template expression at $\ell$, disambiguated by context. We reduce the inductive
assertion generation problem to one of computing those values of the template
variables for which a given template map $\eta$ is inductive. The answer consists of
encoding initiation and consecution using Farkas' Lemma.

Initiation: The implication $\Theta \models \eta(\ell_0)$ is encoded.

Consecution: For each $\tau: \langle \ell_i, \ell_j, \rho_\tau \rangle$, the implication $\eta(\ell_i) \wedge \rho_\tau \models \eta(\ell_j)'$ is
encoded. We shall explore the structure of the resulting constraints in detail
through the remainder of the paper.

The definition of consecution can be strengthened into two more restrictive forms:

Local Consecution: For transition $\tau: \langle \ell_i, \ell_j, \rho \rangle$, $\rho \models \eta(\ell_j)' \geq 0$,

Increasing Value: For transition $\tau: \langle \ell_i, \ell_j, \rho \rangle$, $\rho \models \eta(\ell_j)' \geq \eta(\ell_i)$.

Both these conditions imply consecution. Any map in which some transitions
satisfy these stronger conditions continues to remain an inductive assertion map.

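Example 2. Consider the LTS in Example 1. We fix a template map $\eta(\ell_0) =
c_1 x + c_2 y + d$, $C = \{c_1, c_2, d\}$ being unknown quantities. Initiation is encoded
using Farkas' Lemma,

\[
\begin{array}{c|rcll}
\lambda_1 & x & = & 0 & \\
\lambda_2 & y & = & 0 & \quad \Big\}\ \Theta \\
\hline
 & c_1 x + c_2 y + d & \geq & 0 & \leftarrow \eta(\ell_0)
\end{array}
\]

resulting in the constraints

\[
(\exists \lambda_1, \lambda_2)\ [\, c_1 = \lambda_1 \wedge c_2 = \lambda_2 \wedge d \geq 0 \,]
\]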
After eliminating the multipliers, we obtain $d \geq 0$ for the initiation constraint.
Consecution for $\tau_1$ is encoded using Farkas' Lemma as

\[
\begin{array}{c|rcll}
\mu_1 & c_1 x + c_2 y + d & \geq & 0 & \leftarrow \eta(\ell_0) \\
\lambda_1 & x + 2y - x' & = & 0 & \\
\lambda_2 & y + y' - 1 & = & 0 & \quad \Big\}\ \rho_{\tau_1} \\
\hline
 & c_1 x' + c_2 y' + d & \geq & 0 & \leftarrow \eta(\ell_0)'
\end{array}
\]

which produces the constraints

\[
(\exists \mu_1)\ [\, \mu_1 c_1 - c_1 = 0 \wedge \mu_1 c_2 + c_2 - 2c_1 = 0 \wedge \mu_1 d - d - c_2 \leq 0 \,]
\]

After eliminating $\lambda_1, \lambda_2, \mu_1$, the resulting constraint simplifies to $c_2 = c_1 \geq 0$.
Similarly, the constraint obtained for $\tau_2$ simplifies to $c_1 + 2c_2 \geq 0$. The overall
constraint is the conjunction of the initiation and consecution constraints, which
reduces to $c_1 = c_2 \geq 0$, $d \geq 0$. Solutions are generated by $c_1 = 1$, $c_2 = 1$, $d = 0$,
corresponding to the inductive assertion $x + y \geq 0$ at $\ell_0$.
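
As a cross-check of this solution, the following worked step substitutes the two transition relations of Example 1 into the candidate assertion $x + y \geq 0$. It is added here as an illustration and uses only the definitions of initiation and consecution given earlier.

```latex
\begin{align*}
\text{Initiation:} \quad & x = 0 \wedge y = 0 \;\Rightarrow\; x + y = 0 \geq 0, \\
\tau_1: \quad & x' + y' = (x + 2y) + (1 - y) = (x + y) + 1 \geq 0
   \quad \text{whenever } x + y \geq 0, \\
\tau_2: \quad & x' + y' = (x + 1) + (y + 2) = (x + y) + 3 \geq 0
   \quad \text{whenever } x + y \geq 0.
\end{align*}
```

Both transitions strictly increase $x + y$, so the assertion also satisfies the increasing-value form of consecution.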

3 The Constraint System and Its Solution 



In this section, we study the constraint structure arising from the encoding 
discussed briefly in Section 2, and in detail elsewhere [6]. 

We fix a linear transition system U with variables $\{x_1, \ldots, x_n\}$, collectively
referred to as $x$. The system is assumed to have a single location $\ell$ to simplify
the presentation. The template assertion at location $\ell$ is
$\alpha(c) = c_1 x_1 + \cdots + c_n x_n + d \geq 0$. The coefficient variables $\{c_1, \ldots, c_n, d\}$ are
collectively referred to as $c$. The system's transitions are $\{\tau_1, \ldots, \tau_m\}$, where
$\tau_i: \langle \ell, \ell, \rho_i \rangle$. The initial condition is denoted by $\Theta$. The system in Example 1
will be used as a running example to illustrate the presented ideas.



3.1 Deriving Constraints 

We use Farkas Lemma (Theorem 1) in order to derive constraints for initiation 
and consecution, as shown in Example 2. 

Initiation. The case for initiation is relatively straightforward. We encode
initiation by encoding $\Theta \models \alpha(c) \geq 0$ using Farkas' Lemma. The result is a linear
assertion over the unknowns $c$ and the multipliers $\lambda$; the conditions on $c$ are
obtained by eliminating the multipliers using polyhedral projection.
Let $\Theta = (x = 0 \wedge y = 0)$, and $\alpha = c_1 x + c_2 y + d \geq 0$.
The initiation constraint obtained by using Farkas' Lemma is $d \geq 0$, as shown
in Example 2.
Consecution. Consecution for a transition $\tau_i$ encodes the assertion

\[
(\alpha(c) \geq 0) \wedge \rho_i \models (\alpha(c)' \geq 0)
\]

Using Farkas' Lemma, the constraints obtained are homogeneous, and involve
an existentially quantified non-linear parameter $\mu_i$. We shall term the class of
constraints thus obtained parametric linear assertions.

Definition 5 (Parametric Linear Assertion) Let $c$ be a set of variables and
$\mu_1, \ldots, \mu_m$ be parameters. A parametric linear expression (PL expression) is of
the form $\alpha_1 + \mu_i \alpha_2$, where $\alpha_{\{1,2\}}$ are (homogeneous) linear expressions over $c$.
A parametric linear (in)equality is of the form $\beta \bowtie 0$, $\beta$ being a PL expression.
A PL assertion is a finite conjunction of PL equalities and inequalities.

For a transition $\tau_i$, and template $\alpha(c)$, the consecution constraints obtained
through Farkas' Lemma form a parametric linear assertion over a single
parameter $\mu_i$.

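Example 3. We encode consecution for transition $\tau_2$ from Example 1.

\[
\begin{array}{c|rcll}
\mu_2 & c_1 x + c_2 y + d & \geq & 0 & \leftarrow \eta(\ell_0) \\
\lambda_1 & x - x' + 1 & = & 0 & \\
\lambda_2 & y - y' + 2 & = & 0 & \quad \Big\}\ \rho_{\tau_2} \\
\hline
 & c_1 x' + c_2 y' + d & \geq & 0 & \leftarrow \eta(\ell_0)'
\end{array}
\]

which yields the constraints

\[
\exists (\mu_2, \lambda_1, \lambda_2)
\left[\begin{array}{l}
\mu_2 c_1 + \lambda_1 = 0 \\
\mu_2 c_2 + \lambda_2 = 0 \\
-\lambda_1 = c_1 \\
-\lambda_2 = c_2 \\
\mu_2 d - d + \lambda_1 + 2\lambda_2 \leq 0
\end{array}\right]
\]

Eliminating $\lambda_1, \lambda_2$ yields $\mu_2 c_1 - c_1 = 0 \wedge \mu_2 c_2 - c_2 = 0 \wedge \mu_2 d - d - c_1 - 2c_2 \leq 0$.

These constraints are parametric linear. Local and increasing consecutions can
be enforced by setting $\mu_2 = 0, 1$, respectively.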
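
For illustration, substituting the two values into the eliminated constraints of this example gives the following; the derivation is added here and uses only the constraint stated above.

```latex
\begin{align*}
\mu_2 = 0: \quad & -c_1 = 0 \;\wedge\; -c_2 = 0 \;\wedge\; -d - c_1 - 2c_2 \leq 0
   \;\;\Longleftrightarrow\;\; c_1 = c_2 = 0 \;\wedge\; d \geq 0, \\
\mu_2 = 1: \quad & 0 = 0 \;\wedge\; 0 = 0 \;\wedge\; -c_1 - 2c_2 \leq 0
   \;\;\Longleftrightarrow\;\; c_1 + 2c_2 \geq 0.
\end{align*}
```

For this transition, local consecution therefore admits only the trivial template, while increasing-value consecution leaves the linear condition $c_1 + 2c_2 \geq 0$.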

The Overall Constraint. The overall constraint obtained is the conjunction of 
the constraints obtained from initiation and consecution for each transition. This 
constraint is a combination of several types of constraints. Initiation results in 
a linear assertion, whereas each consecution condition results in PL assertions 
over parameters $M = \{\mu_1, \ldots, \mu_m\}$, the parameter $\mu_i$ arising from $\tau_i$. Each of
these parameters is required to be nonnegative, and is existentially quantified. 
In order to compute the actual constraint over c, the parameters in M need to 
be eliminated. 
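Example 4. The overall constraint for the system in Example 1 is now

\[
(\exists \mu_1, \mu_2)\
\underbrace{d \geq 0}_{\text{Initiation}}
\ \wedge\
\underbrace{\left[\begin{array}{l}
\mu_1 c_1 - c_1 = 0 \\
\mu_1 c_2 + c_2 - 2c_1 = 0 \\
\mu_1 d - d - c_2 \leq 0 \\
\mu_1 \geq 0
\end{array}\right]}_{\tau_1}
\ \wedge\
\underbrace{\left[\begin{array}{l}
\mu_2 c_1 - c_1 = 0 \\
\mu_2 c_2 - c_2 = 0 \\
\mu_2 d - d - c_1 - 2c_2 \leq 0 \\
\mu_2 \geq 0
\end{array}\right]}_{\tau_2}
\]
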

3.2 Exact Elimination 

The constraints in Example 4 are non-linear and existentially quantified. How- 
ever, the theory of non-linear assertions over reals admits computable quanti- 
fier elimination, as shown by Tarski [16]. Many others have improved the al- 
gorithm [5,17]. Packages like redlog and QEPCAD can handle small/medium 
sized examples. In our earlier work, we used these techniques to handle the con- 
straints derived from elimination. However, there are many drawbacks to using 
these tools. 

1. The technique does not scale to systems of more than five variables. 

2. The technique yields large formulas with many non-linear constraints that 
cancel in the final result, leading to much redundant effort. 

3. The structure in the constraints is not fully utilized. The constraints are of 
low degree, and exhibit a uniform structure. This is lost as soon as some of 
the parameters are eliminated. 

4. In case the underlying LTS has some special structure, the use of elimination 
may be completely unnecessary, as demonstrated for the case of Petri Net 
transitions in [14] . The result can be extended to cases where a subset of the 
transitions have a Petri-net like structure, as is the case with many systems. 

Of course, the completeness of quantifier elimination, and Farkas Lemma lead to 
theoretical claims of completeness (see [6] for details). We are not aware of any 
alternative exact procedure for solving these constraints precisely. Therefore, we 
shall concentrate on under-approximate elimination. 

3.3 Under-Approximate Elimination Technique

Any under-approximate elimination technique is sound.

Lemma 1. Let $\psi(c, \mu)$ be the overall constraints obtained from encoding
inductiveness. Let $\varphi(c)$ be an assertion such that

\[
\varphi(c) \models (\exists \mu \geq 0)\ \psi(c, \mu)
\]

Any solution to $\varphi$ is an inductive assertion.

Proof. Let $c = a$ be a solution to $\varphi$. Then, there exist non-negative parameters $\mu$
such that $\psi(a, \mu)$ holds. The rest follows by the soundness of our constraint
generation process. See [6] for a proof.
We first split the overall constraints $\psi(c, \mu)$ into different groups: $\varphi_{\{eq,in\}}$,
$\gamma_{\{eq,in\}}$, and $\psi_\mu$:

- $\varphi_{eq}$ and $\varphi_{in}$ contain the equalities and inequalities, respectively, on $c$. We
  assume $\varphi_{eq} \models \varphi_{in}$.

- $\gamma_{eq}$ and $\gamma_{in}$ contain the PL equalities and inequalities, respectively, over $c$
  and $\mu$.

- $\psi_\mu$ contains the constraints on $\mu$: conjunctions of linear inequalities, equalities,
  and disequalities, where the disequalities are produced by our constraint
  solving rules.

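Example 5. The constraints from Example 4 are classified as follows:

\[
\underbrace{d \geq 0}_{\varphi_{in}}
\ \wedge\
\underbrace{\left[\begin{array}{l}
\mu_1 c_1 - c_1 = 0 \\
\mu_1 c_2 + c_2 - 2c_1 = 0 \\
\mu_2 c_1 - c_1 = 0 \\
\mu_2 c_2 - c_2 = 0
\end{array}\right]}_{\gamma_{eq}}
\ \wedge\
\underbrace{\left[\begin{array}{l}
\mu_1 d - d - c_2 \leq 0 \\
\mu_2 d - d - c_1 - 2c_2 \leq 0
\end{array}\right]}_{\gamma_{in}}
\ \wedge\
\underbrace{\left[\begin{array}{l}
\mu_1 \geq 0 \\
\mu_2 \geq 0
\end{array}\right]}_{\psi_\mu}
\]
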

The linear part of a system of constraints is defined as the constraint $\varphi_{in} \wedge \varphi_{eq}$.
The system is unsatisfiable if $\varphi_{\{in,eq\}}$ or $\psi_\mu$ are, and trivial if $\varphi_{eq}$ is of the form
$c_1 = \ldots = c_n = 0$. The only inductive assertion that a trivial system can possibly
yield is $1 \geq 0$.



Constraint Simplification. The simplifications involving equalities in $\varphi_{eq}$
are the following:

1. Every equality expression in $\varphi_{eq}$ of the form $a_1 c_1 + \cdots + a_n c_n + a_{n+1} d = 0$
forms a rewrite rule of the form $c_i \rightarrow -\frac{a_{i+1}}{a_i} c_{i+1} - \cdots - \frac{a_{n+1}}{a_i} d$, where $i$ is the
smallest index with $a_i \neq 0$.

2. Apply this rule to eliminate $c_i$ over the linear and PL parts. Simplify, and
repeat until all the equalities have been converted.

Similarly, a constraint of the form $\mu_i = a$ in $\psi_\mu$ is used to rewrite $\mu_i$ in $\gamma_{\{eq,in\}}$. The
constraints added to $\varphi_{\{eq,in\}}$ can trigger further simplifications and similarly,
constraints in $\gamma_{eq}$ can be used as rewrite rules in order to simplify constraints in
$\gamma_{in}$.



Factorization and Splitting. A PL expression is factorizable iff it can be written
in the form $(\mu - a)\alpha$, where $\alpha$ is a linear expression over $c$. Deciding if an
expression is factorizable is linear time in the expression size. A PL equality
$(\mu - a)\alpha = 0$ factorizes into two factors $\mu - a = 0 \vee \alpha = 0$. Similarly a PL
inequality $(\mu - a)\alpha \geq 0$ factorizes into $(\mu - a \geq 0 \wedge \alpha \geq 0) \vee (\mu - a \leq 0 \wedge \alpha \leq 0)$.
Since our system of constraints is a conjunction of (in)equalities, factorization
splits a constraint system into a disjunction of two systems. The following is a
factorization strategy for equalities:

1. Choose a factorizable expression $(\mu - a)\alpha = 0$, and remove it from the
constraints.

2. Create two constraint systems, each containing all the remaining constraints.
Add $\mu = a$ to one system, rewriting all occurrences of $\mu$ by $a$. Add
$\alpha = 0 \wedge \mu \neq a$ to the other system, and simplify.
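
The factorizability test can be sketched as a single pass over the coefficients. In the sketch below a PL expression $\alpha_1 + \mu\alpha_2$ is given by the integer coefficient vectors of $\alpha_1$ and $\alpha_2$ over the unknowns; it is factorizable as $(\mu - a)\alpha_2$ exactly when $\alpha_1 = -a\,\alpha_2$ for a single scalar $a$. The function name `factorize` and the vector encoding are assumptions of this example.

```python
from fractions import Fraction

def factorize(alpha1, alpha2):
    """Try to write alpha1 + mu*alpha2 as (mu - a)*alpha2.
    Returns the scalar a, or None if no such a exists."""
    if not any(alpha2):
        return None  # mu does not occur in the expression
    a = None
    for p, q in zip(alpha1, alpha2):
        if q == 0:
            if p != 0:
                return None  # alpha1 has a term that alpha2 lacks
            continue
        ratio = Fraction(-p, q)
        if a is None:
            a = ratio
        elif a != ratio:
            return None      # coefficients are not proportional
    return a

# coefficient order (c1, c2, d)
print(factorize((-1, 0, 0), (1, 0, 0)))   # mu*c1 - c1        -> 1
print(factorize((-2, 1, 0), (0, 1, 0)))   # mu*c2 + c2 - 2*c1 -> None
print(factorize((0, 1, 0), (0, 1, 0)))    # mu*c2 + c2        -> -1
```

The single loop matches the linear-time claim made above; the third call shows a factor $\mu = -1$, which a non-negativity requirement on $\mu$ would then rule out.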



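Example 6. The constraint system in Example 4 has a factorizable equality
$\mu_2 c_i - c_i = 0$, for $i \in \{1, 2\}$. We add $\mu_2 = 1$ to one child, and $c_1 = 0 \wedge c_2 = 0$
to the other, yielding

\[
\left[\begin{array}{ll}
d \geq 0,\ \ -c_1 - 2c_2 \leq 0 & (\varphi_{in}) \\
\mu_1 c_1 - c_1 = 0,\ \ \mu_1 c_2 + c_2 - 2c_1 = 0 & (\gamma_{eq}) \\
\mu_1 d - d - c_2 \leq 0 & (\gamma_{in}) \\
\mu_1 \geq 0,\ \ \mu_2 = 1 & (\psi_\mu)
\end{array}\right]
\ \vee\
\left[\begin{array}{ll}
c_1 = 0,\ \ c_2 = 0 & (\varphi_{eq}) \\
d \geq 0 & (\varphi_{in}) \\
\mu_1 d - d \leq 0,\ \ \mu_2 d - d \leq 0 & (\gamma_{in}) \\
\mu_1 \geq 0,\ \ \mu_2 \neq 1 & (\psi_\mu)
\end{array}\right]
\]

The constraints on the right are trivial. The system on the left can be factorized
using the equality $\mu_1 c_1 - c_1 = 0$. We obtain:

\[
\left[\begin{array}{ll}
2c_2 - 2c_1 = 0 & (\varphi_{eq}) \\
d \geq 0,\ \ -c_1 - 2c_2 \leq 0,\ \ -c_2 \leq 0 & (\varphi_{in}) \\
\mu_1 = 1,\ \ \mu_2 = 1 & (\psi_\mu)
\end{array}\right]
\ \vee\
\left[\begin{array}{ll}
c_1 = 0 & (\varphi_{eq}) \\
d \geq 0,\ \ -c_1 - 2c_2 \leq 0 & (\varphi_{in}) \\
\mu_1 c_2 + c_2 = 0 & (\gamma_{eq}) \\
\mu_1 d - d - c_2 \leq 0 & (\gamma_{in}) \\
\mu_1 \geq 0,\ \ \mu_2 = 1,\ \ \mu_1 \neq 1 & (\psi_\mu)
\end{array}\right]
\]

The system on the left (unsimplified) has been completely linearized. The system
on the right can be further factored using $\mu_1 c_2 + c_2 = 0$, yielding $\mu_1 = -1$ on
one side, and $c_2 = 0$ on the other. Setting $\mu_1 = -1$ contradicts $\mu_1 \geq 0$, while
setting $c_2 = 0$ makes the system trivial. Therefore, repeated factorization and
simplification yields the linear assertion $c_1 = c_2 \wedge c_1 \geq 0 \wedge d \geq 0$, which is
equivalent to the result of the exact elimination.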



Simplification and factorization can be repeatedly applied to split the initial 
constraints into a tree of constraints, such that each leaf has no more rules appli- 
cable. Each node in the tree is equivalent to the disjunction of its children. There- 
fore, the root is equivalent to the disjunction of all the leaves. The leaves can be 
either completely resolved (linear), unsatisfiable, trivial, or terminal. A terminal 
leaf is satisfiable and non-trivial, but contains unresolved non-linearities, which 
cannot be further split or simplified. 
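
The tree of constraints described above can be pictured with a small data structure. The following Python sketch is only an illustration: the class and function names are hypothetical and not taken from the authors' implementation, and the constraint systems are represented by plain labels rather than polyhedra. It records the split tree of Example 6 and collects its leaves, whose disjunction is equivalent to the root.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class LeafKind(Enum):
    LINEAR = "linear"            # completely resolved
    UNSAT = "unsatisfiable"
    TRIVIAL = "trivial"
    TERMINAL = "terminal"        # satisfiable, non-trivial, still non-linear


@dataclass
class Node:
    label: str                               # informal description of the system
    kind: Optional[LeafKind] = None          # set only on leaves
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List["Node"]:
        """Collect the leaves depth-first; the root is their disjunction."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def useful_leaves(root: Node) -> List[Node]:
    """Leaves that can still contribute an invariant (linear or terminal)."""
    wanted = (LeafKind.LINEAR, LeafKind.TERMINAL)
    return [leaf for leaf in root.leaves() if leaf.kind in wanted]


# The split tree of Example 6, written down by hand.
example6 = Node("root", children=[
    Node("mu2 = 1", children=[
        Node("mu1 = 1", kind=LeafKind.LINEAR),
        Node("c1 = 0, mu1 != 1", children=[
            Node("mu1 = -1", kind=LeafKind.UNSAT),
            Node("c2 = 0", kind=LeafKind.TRIVIAL),
        ]),
    ]),
    Node("c1 = c2 = 0, mu2 != 1", kind=LeafKind.TRIVIAL),
])
```

In this encoding only the leaf labelled `mu1 = 1` survives, matching the single linear assertion obtained in Example 6.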



Handling Terminal Constraints. There are many ways of handling these con- 
straints, some exact and some under-approximate. 
#   Name            Description
1   Simplification  Substitute equalities into expressions
2   Factorization   Choose a factorizable expression, and split disjuncts
3   Subsumption     Test containment of linear part w.r.t. fully-resolved nodes
4   Split           Use Lemmas 2 and 3 to split
5   Instantiate     Set μi = 0, 1, split and proceed


Fig. 2. Constraint-simplification rules. 



Subsumption. If a terminal leaf (or even a non-terminal branch) has its linear part 
subsumed by another fully linear leaf, we can ignore it without loss of accuracy. 
Checking subsumption thus allows us to eliminate non-terminal nodes too. Even 
though polyhedral containment is expensive in higher dimensions, we find that 
a significant fraction of the nodes explored are eliminated this way. 

Split. In some special cases, it is possible to simplify a terminal system further. 
The following lemmas are inspired by our work on Petri net transitions [14]. 



Lemma 2. Let a1, a2 be linear expressions, and μ be a parameter not occurring 
in a1 or a2. Then 

\[
\begin{array}{lcl}
(\exists \mu \ge 0)\,(a_1 + \mu a_2 = 0) & \equiv & (a_1 = a_2 = 0) \;\lor\; (a_1 \le 0 \land a_2 > 0) \;\lor\; (a_1 \ge 0 \land a_2 < 0),\\
(\exists \mu \ge 0)\,(a_1 + \mu a_2 \ge 0) & \equiv & (a_1 \ge 0) \;\lor\; (a_2 > 0).
\end{array}
\]

Lemma 3. (Ac ≥ 0) ∨ (Bc > 0) ⊨ (∃ μ ≥ 0) Ac + μBc ≥ 0 

These lemmas can be extended to systematically handle more complicated 
constraints on μ. They can also be modified to apply when more than one constraint 
exists, with loss of completeness. 
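
As a sanity check of Lemma 2, the two equivalences can be tested exhaustively on small integer values. The sketch below is not part of the method; it assumes the reading in which the parameter ranges over μ ≥ 0, the equality case allows a1 ≤ 0 (resp. a1 ≥ 0) when a2 > 0 (resp. a2 < 0), and the inequality is non-strict. Here a1 and a2 stand for the values the two linear expressions take at some point, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def exists_mu_eq(a1: int, a2: int) -> bool:
    """Is there mu >= 0 with a1 + mu*a2 == 0?  Solve for mu directly."""
    if a2 == 0:
        return a1 == 0
    return Fraction(-a1, a2) >= 0


def exists_mu_geq(a1: int, a2: int) -> bool:
    """Is there mu >= 0 with a1 + mu*a2 >= 0?  Try the best witness."""
    # The expression is linear in mu: if a2 <= 0 it is largest at mu = 0,
    # otherwise mu = max(0, -a1/a2) already makes it non-negative.
    mu = Fraction(0) if a2 <= 0 else max(Fraction(0), Fraction(-a1, a2))
    return a1 + mu * a2 >= 0


def rhs_eq(a1: int, a2: int) -> bool:
    return (a1 == 0 and a2 == 0) or (a1 <= 0 and a2 > 0) or (a1 >= 0 and a2 < 0)


def rhs_geq(a1: int, a2: int) -> bool:
    return a1 >= 0 or a2 > 0


def lemma2_holds_on_grid(bound: int = 5) -> bool:
    values = range(-bound, bound + 1)
    return all(
        exists_mu_eq(a1, a2) == rhs_eq(a1, a2)
        and exists_mu_geq(a1, a2) == rhs_geq(a1, a2)
        for a1, a2 in product(values, values)
    )
```

Calling `lemma2_holds_on_grid()` compares both sides on every pair in a small grid; it is a test of this particular reading, not a proof.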

Instantiate. Finally, instantiating some parameter μ to a value in {0, 1} lets us resolve it. 
Other values of μ are also possible. However, using {0, 1} restricts the template 
assertion α ≥ 0 to satisfy local or increasing-value consecution, respectively, as 
defined in Section 2. The advantage of this strategy is that it is efficient and 
simple, especially if some invariants are to be generated in as short a time as 
possible. 



4 Experimental Results 

We have implemented our method and evaluated it on several programs. Our 
prototype implementation uses the library PPL for manipulating polyhedra [2] 
supplemented with our own implementation of some of the rules in Figure 2, 
discussed below. We compared our method against forward propagation with 
two different widenings provided by PPL: the standard CH79 widening [8] and the 
BHRZ03 widening [1]. The BHRZ03 operator is provably more accurate, but less 
efficient than the CH79 widening. Since we implemented the post-condition 
ourselves, we present separately the time spent computing post-conditions and the 
time spent on PPL-provided widening. 

Experimenting with a few strategies, we converged on a strategy that scaled 
to larger examples. Some of the salient features of the strategy are the following: 

— For multi-location systems, the transitions are classified as intra-location and 
inter-location. The constraints for the intra-location transitions at each 
location are resolved by a subset of the rules described previously. Specifically, 
factorization is performed only over equalities, and Lemmas 2 and 3 are not 
used. Handling factors over inequalities requires more polyhedral reasoning 
at every simplification, while the use of the two lemmas requires sophisticated 
reasoning involving equalities, inequalities and disequalities. Our disequality 
constraint solver uses heuristic rules whose completeness remains unresolved. 

— Local and increasing consecution are used for each inter-location transition. 
This strategy can be proven exact for many situations. 

— The constraints for each location and the inter-location transitions are com- 
bined conjunctively. Converting this CNF expression to DNF is a significant 
bottleneck, requiring aggressive subsumption tests. 

— Constraints are solved depth-first as much as possible, favouring branches 
that can be resolved faster. The collection of linear constraints from 
resolved branches enables aggressive subsumption testing. The CNF to DNF 
conversion is also performed in a depth-first fashion to enable invariants to 
be computed eagerly. 
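
The depth-first CNF to DNF conversion with subsumption pruning can be sketched on a propositional stand-in. In the sketch below a disjunct is just a set of literal names, and a partial conjunction is pruned as soon as it contains an already discovered disjunct. This syntactic subset test is an assumption made for illustration; the method described here tests polyhedral containment instead, and the function name is hypothetical.

```python
from typing import FrozenSet, List, Sequence

Literal = str
Clause = Sequence[Literal]        # a disjunction of literals
Disjunct = FrozenSet[Literal]     # a conjunction of literals


def cnf_to_dnf(clauses: Sequence[Clause]) -> List[Disjunct]:
    """Depth-first CNF -> DNF conversion that prunes subsumed branches."""
    found: List[Disjunct] = []

    def subsumed(partial: Disjunct) -> bool:
        # A conjunction with more literals is stronger, so it is covered
        # by any already-found disjunct whose literals it contains.
        return any(d <= partial for d in found)

    def explore(i: int, partial: Disjunct) -> None:
        if subsumed(partial):
            return                      # every extension is subsumed too
        if i == len(clauses):
            # Drop earlier disjuncts that the new, weaker one covers.
            found[:] = [d for d in found if not partial <= d]
            found.append(partial)       # available eagerly, before the rest
            return
        for literal in clauses[i]:
            explore(i + 1, partial | {literal})

    explore(0, frozenset())
    return found
```

For the CNF (a ∨ b) ∧ (a ∨ c) the search finds the disjunct {a} first and then prunes the branches {a, c} and {a, b} without expanding them, leaving {a} and {b, c}.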

As an added benefit, the execution can be interrupted after a reasonable amount 
of time, and still yield many non-trivial invariants. In several cases our invariants 
are disjoint from the results of forward propagation, because propagation-based 
techniques can compute invariants φ1 ∧ φ2 that are mutually inductive, that 
is, neither φ1 nor φ2 is inductive by itself, while our technique only 
discovers single inequalities that are inductive by themselves. However, repeating 
the procedure with the computed invariants added to the guards of the transition 
system usually provides the stronger invariants. 

4.1 Low Dimensional Systems 

Figure 3 shows the experimental results for some small to medium sized examples 
from the related work and some benchmarks from analysis tools such as FAST [3]. 
The number of variables for each program is shown in the second column. The 
table shows for each program the time (in seconds) of our (constraint-based) 
approach, and the time taken by the CH79 and BHRZ03 approaches. All 
computation times were measured on an Intel Xeon 1.7 GHz CPU with 2 GB RAM. The 
last two columns show the strength of the invariants computed by our method 
compared with those computed by CH79 and BHRZ03, respectively. A + 
indicates that our invariants are strictly stronger, or no invariants were obtained by 
the other method within 1 hour. The +^= indicates that our invariants were 
stronger for three locations, while they were the same for the other locations. An 
=, ≠ and − indicate that our invariants are equal, incomparable, and strictly 
weaker, respectively. The suffix N indicates that all the variables in the system 
were constrained to be positive to increase the number of invariants discovered 

Program             Constraint-based method    CH79 time (secs)                    BHRZ03 time (secs)        C-B invariants versus
Name          vars  time    #br    #sub        total        post         widen     total    post   widen     CH79   BHRZ03
SEE-SAW       2     0.03    13     8           0            0            0         0        0      0         +      +
ROBOT         3     0.02    2      1           0.01         0            0.01      0.01     0.01   0         =      =
TRAIN-HPR97   3     0.86    25     5           0.02         0.02         0         0.02     0.02   0         +^=    +^=
RAND          4     0.02    3      2           0.01         0            0         0        0      0         =      =
BERKELEY      4     0.06    11     8           0.01         0            0.01      0.01     0      0.01             ≠
BERKELEY-N    4     0.04    9      4           0.01         0.01         0         0.01     0      0.01      +      +
HEAPSORT      5     0.1     21     12          0.02         0.01         0.01      0.02     0.02   0                ≠
TRAIN-RM03    6     1.16    193    99          0.06         0.05         0.01      0.07     0.05   0.02      +      =
EFM           6     0.36    57     23          0            0            0         0.01     0.01   0         −      −
EFM1          6     0.32    57     27          0            0            0         0.01     0.01   0         −      −
LIFO          7     0.88    58     51          0.29         0.27         0.02      0.32     0.29   0.03      ≠      ≠
LIFO-N        7     10.13   1191   593         0.27         0.25         0.02      0.32     0.27   0.04      +      +
CARS-MIDPT    7     0.1     17     8           32.8         5            27.8      > 3600                    +      +
BARBER        8     1.68    125    84          0.18         0.17         0.01      20.41    0.18   20.23     +      +
SWIM          9     0.42    36     22          0.08         0.06         0.02      0.61     0.06   0.55      −      −
SWIM1         9     0.88    65     32          [illegible]  [illegible]  0.01      0.59     0.06   0.53      =      =



Fig. 3. Experimental results for some low-dimensional systems. #br is the number of 
branches, #sub is the number pruned by subsumption tests. 



in one run. The programs SWIM1 and EFM1 were obtained by adding the previously 
computed invariants as guards to the transition relations. 

The figure shows that for the programs tested our invariants are mostly 
superior or comparable, but at a significant extra cost in computation time for 
the smaller dimensions. However, the situation changes when the dimensionality 
of the systems is increased beyond ten variables, as shown in the next section. 



4.2 Higher-Dimensional Systems 

To evaluate our method for systems with more variables we compared its per- 
formance on instances of two parameterized systems. 

Pre-emptive Scheduler: The first system is an n-process pre-emptive scheduler 
inspired by the two-process example in Halbwachs et al. [11]. Two arrivals of 
process pi are separated by at least ci time units, for a fixed ci. Process pi 
pre-empts process pj for j < i. The system has n locations, where location ℓi denotes 
that process pi is executing and that there are no waiting processes pj for j > i. 

Convoy of Cars: The second system consists of n cars on a straight road whose 
accelerations are controlled (as in real life), determining their velocity. The lead 
car non-deterministically chooses an acceleration. The controller for each car 
detects when the lead car is too close or too far, and, after a bounded reaction 
time, adjusts acceleration. Time was discretized in order to linearize the resulting 
transition system. 

Program                  Constraint-based method  CH79 time (secs)             BHRZ03 time (secs)      C-B invariants versus
Name        vars         time     #br    #sub     total    post   widen        total    post   widen   CH79         BHRZ03
SCHEDULER
2 proc.     7            0.54     23     10       0.15     0.12   [illegible]           0.12   0.07    [illegible]  [illegible]
3 proc.     10           8.21     36     16       39.5     26.9   12.8         2232     27.5   2204
4 proc.     13           284      55     26                                    > 3600
5 proc.     16           > 3600          41                                    > 3600                  ?            ?
CARS
2 proc.     [illegible]  3.54     93              5.22     4.23                443      200    243
3 proc.     [illegible]           468                                          > 3600
4 proc.     [illegible]  1006     3722   1897     > 3600                       > 3600


Fig. 4. Performance Comparison on parameterized examples Scheduler and Cars. 



Figure 4 shows the performance comparison. In all cases above 10 variables, 
our technique outperforms the other two techniques. The propagation-based 
techniques ran out of time for these systems. Our method ran out of time for the 
5-process scheduler. It did so while converting a large CNF formula into a DNF 
formula. In fact, two different timeouts, 700s and 3600s, yielded the same 
(non-trivial) invariants. A total of 19 disjuncts in the normal form conversion were 
found to be relevant within the first 700 seconds, while all the 75791 disjuncts 
computed in the next 2900 seconds were found to be subsumed by the original 
19. This suggests that a vast majority of disjuncts in the computed DNF 
yield the same invariant, which was confirmed by other examples. 

5 Conclusion 

Linear programming, as a discipline, has seen tremendous advances in the past 
century. Our research demonstrates that some ideas from linear programming 
can be used to provide alternative techniques for linear-relations analysis. Anal- 
ysis carried out this way has some powerful advantages. It provides the ability to 
adjust the complexity and the accuracy in numerous ways. The constraint-based 
perspective for linear relations analysis can be powerful, both in theory and in 
practice. 

Future work needs to concentrate on increasing the dimensionality and the 
complexity of the application examples for this analysis. Numerous mathematical 
tools remain to be explored for this domain. The use of numerical and 
interval-numerical techniques for handling robustness in polyhedral computations 
has remained largely unexplored. Manipulation techniques for compressed 
representations along the lines of Halbwachs et al. [10] have also shown promise. 
Further investigations into the geometry of these constraints will yield a more precise 
and faster analysis. 

Abstract. Programming language technology can contribute to the 
development and understanding of Systems Biology by providing formal 
calculi for specifying and analysing the dynamic behaviour of biological 
systems. Our focus is on BioAmbients, a variation of the ambient cal- 
culi developed for modelling mobility in computer systems. We present 
a static analysis for capturing the spatial structure of biological systems 
and we illustrate it on a few examples. 



1 Introduction and Motivation 

Systems biology is an approach to studying biological phenomena that is based 
on a high-level view of biological systems. The main focus is not the structure of 
biological components but rather the dynamics of these components. This poses 
a challenge for computer science: can programming language technology be used 
to model and analyse not only the structure of biological processes but also their 
evolution? 

Pioneering work by Shapiro et al. [15] demonstrated how biological processes 
could be specified in the π-calculus [7]; the formalism showed its strength at 
the molecular and biochemical level but it was less successful at the higher 
abstraction levels where compartments play a central role. Here, on the other 
hand, a version of the Ambient Calculus [3], called BioAmbients [13,14], shows 
promise as the hierarchical structure of the ambients is very similar to that of 
compartments; the main difference between the two calculi is in the choice of 
the primitives for modelling the interaction between ambients or compartments. 
Surely biological systems are very complex and the scope for developing calculi 
of computation that capture various aspects of these systems is endless; recent 
work includes [2, 5, 8, 12]. 

The main goal, of course, is to capture the behaviour of biological systems 
in a faithful manner. Over the years biologists have collected observations about 
biological systems in large databases and it is important to investigate to what 
extent our models can explain these data in a satisfactory way. It turns out that 
many of the observations collected by the biologists concern spatial properties 
(as opposed to temporal properties) and this is where static analysis - and the 
present paper - gets into the picture. 

* This research has been supported by the LoST project (number 21-02-0507) funded 
by the Danish Natural Science Research Council. 

Overview of the Paper. In Section 2 we present the syntax and semantics of 
BioAmbients. The spatial analysis is developed in two stages: First a compati- 
bility analysis is developed in Section 3; it computes an over-approximation of 
the possible interactions within the system of interest. This information is then 
used in the spatial analysis presented in Section 4; this analysis contains a novel 
treatment of recursion and a new technique for reducing the space complexity of 
the analysis. Finally, Section 5 illustrates our approach on a few examples and 
contains our concluding remarks. 



2 BioAmbients 

BioAmbients [13, 14] differ from Mobile Ambients [3] and its siblings Safe Am- 
bients [6], Boxed Ambients [1] and Discretionary Ambients [11] in a number of 
ways. The most important difference is that the names (or identities) of the 
ambients do not control the interaction between ambients, but rather names 
(of channels) serve that purpose. BioAmbients follow the approach of safe and 
discretionary ambients and specify interactions by matching capabilities and 
co-capabilities; the communication primitives are reminiscent of boxed 
ambients in that communication can occur across ambient boundaries, but it is 
based on channels as in the π-calculus. 

BioAmbients deviate from the other ambient calculi in having a non-deterministic 
choice operation in addition to the construct for parallelism (just as the 
π-calculus [7]). The pioneering development presented in [13,14] observes the 
need to use a general recursion construct in order to faithfully model biological 
systems but the theoretical development is only performed for the classical repli- 
cation construct. To be able to analyse such examples we shall therefore study a 
version of BioAmbients with a general recursion operator and, as we shall see in 
later sections, this poses some interesting technical challenges for the theoretical 
properties of the analysis. 

The syntax of BioAmbients is given in Table 1; here we write P for processes 
and M for capabilities. Each ambient has an identity μ ∈ Ambient and each 
capability has a label ℓ ∈ Lab; these annotations have no semantic significance 
but are useful as “pointers” into the process and also serve a role in the analysis. 
(We shall not require that identities or labels are unique.) Furthermore, each 
name n has a canonical name ⌊n⌋ ∈ Name and we shall demand that 
alpha-renaming preserves the canonical name; consequently it will be the canonical 
name rather than the name that will be recorded in the analysis. For the sake of 
simplicity, we shall assume that a subset C ⊆ Name of the canonical names is 
reserved for constants and below we shall require that names introduced by (n)P 
satisfy ⌊n⌋ ∈ C. The capabilities M are based on names and hence we shall write 
⌊M⌋ ∈ Cap for the corresponding canonical capability obtained by replacing the 
names with the corresponding canonical names. The input capabilities (n?{p}, 
etc.) introduce new names p acting as placeholders (or variables); below we 
shall require that ⌊p⌋ ∈ V where V = Name \ C. Finally, processes may be 
recursively defined using the construct rec X. P and to simplify the development 

Table 1. Syntax of processes P and capabilities M. 

P ::= 0                         inactive process
   |  (n)P                      binding box for the name n
   |  [P]^μ                     ambient P with the identity μ
   |  M^ℓ.P                     prefixing with the capability M labelled ℓ
   |  P | P'                    parallel processes
   |  P + P'                    non-deterministic (external) choice
   |  rec X.P                   recursive process (X = P)
   |  X                         process identifier

M ::= enter n    |  accept n    enter movement
   |  exit n     |  expel n     exit movement
   |  merge- n   |  merge+ n    merge movement
   |  n!{m}      |  n?{p}       local communication
   |  n_!{m}     |  n^?{p}      communication to child
   |  n^!{m}     |  n_?{p}      communication to parent
   |  n#!{m}     |  n#?{p}      communication between siblings



we shall require that X indeed occurs inside P; obviously the usual replication 
operation !P can be obtained as rec X. (P | X) (assuming X does not occur free 
in P). Analogously to the treatment of names we shall require that each process 
identifier X has a canonical identity ⌊X⌋ that is preserved by alpha-renaming. 

Programs will be processes P* satisfying the predicate PRG_C(P*) defined as 
the conjunction of the following conditions (explained below): 

— P* has no free process identifiers; formally fpi(P*) = ∅. 

— P* only has free names from C; formally ⌊fn(P*)⌋ ⊆ C. 

— P* is well-formed w.r.t. C; formally C ⊢ P*. 

Here we write fpi(P) for the set of free process identifiers of P and fn(P) for the 
free names of P; the canonicity operation ⌊·⌋ is extended in a pointwise manner 
to sets of names. The well-formedness predicate C ⊢ P serves two purposes: first, 
it enforces the implicit typing requirements imposed by the division of Name 
into the two disjoint subsets C and V; secondly, it imposes the condition that 
process identifiers are actually used recursively in the processes they define. The 
predicate is formally defined in Table 2 and uses bn(M) to denote the bound 
names of the capability M. Note that the condition will reject processes like 
(n) [enter | m?{n}^^. enter where the same name n is introduced as a 

constant and also introduced in an input capability; a simple alpha-renaming 
will, of course, solve the problem. 

We shall write P[m/n] for the process that is as P except that all free oc- 
currences of the name n are replaced by the name m. Similarly, we shall write 
P[Q/X] for the process that is as P except that all free occurrences of the process 
identifier X are replaced by the process Q. In both cases we take care to per- 
form the necessary alpha-renamings (preserving canonicity) to avoid capturing 
free names or process identifiers. 

Table 2. Well-formedness of processes with respect to C: C ⊢ P. 

    C ⊢ 0

      C ⊢ P
    ----------    if ⌊n⌋ ∈ C
     C ⊢ (n)P

      C ⊢ P
    -----------
     C ⊢ [P]^μ

      C ⊢ P
    -----------   if ⌊bn(M)⌋ ∩ C = ∅
     C ⊢ M^ℓ.P

     C ⊢ P    C ⊢ P'
    -----------------
       C ⊢ P | P'

     C ⊢ P    C ⊢ P'
    -----------------
       C ⊢ P + P'

       C ⊢ P
    -------------   if X ∈ fpi(P)
     C ⊢ rec X.P

    C ⊢ X



Table 3. Structural congruence: P ≡ Q is the least congruence defined by the following rules. 

Alpha-renaming of bound names and bound process identifiers: 
    P ≡ Q    if P may be alpha-renamed to Q (preserving canonicity) 

Reordering of parallel processes: 
    P | P' ≡ P' | P 
    (P | P') | P'' ≡ P | (P' | P'') 
    P | 0 ≡ P 

Reordering of choice processes: 
    P + P' ≡ P' + P 
    (P + P') + P'' ≡ P + (P' + P'') 
    P + 0 ≡ P 

Scope rules for name bindings: 
    (n)0 ≡ 0 
    (n)(m)P ≡ (m)(n)P 
    (n)(P | P') ≡ ((n)P) | P'    if n ∉ fn(P') 
    (n)(P + P') ≡ ((n)P) + P'    if n ∉ fn(P') 
    (n)([P]^μ) ≡ [(n)P]^μ 

Recursion: 
    rec X.P ≡ P[rec X.P/X] 






Example 1. To illustrate the development we consider the following program 
P_virus; as we shall see shortly it models how the gene of a virus may infect a cell: 

[ recX. enter . X + exit X + E7{xY^. expel x^'^. X 
I [exit ri 3 =. 0]9ene-^virus 

I [ recT. accept . Y -|- expel n^’’. Y -|- c-ljna}^®. 

It is trivial to check that the well-formedness condition is fulfilled for C = 

{[cj, L«iJ: □ 

Semantics. The semantics follows the standard approach and is specified by 
the structural congruence relation P ≡ Q in Table 3 and the transition relation 
P → Q in Table 4. The congruence relation uses a disciplined notion of 
alpha-renaming that preserves canonicity. The movement interactions merely give rise 
to a rearrangement of the ambient structure where some potential continuations 
are excluded (due to the presence of the non-deterministic choice operation) . The 
communication interactions also exclude some potential continuations but they 

Table 4. Transition relation: P → Q. 

Movement of ambients: 

    [(enter n^ℓ1.P + P') | P'']^μ1 | [(accept n^ℓ2.Q + Q') | Q'']^μ2  →  [[P | P'']^μ1 | Q | Q'']^μ2 

    [[(exit n^ℓ1.P + P') | P'']^μ1 | (expel n^ℓ2.Q + Q') | Q'']^μ2  →  [P | P'']^μ1 | [Q | Q'']^μ2 

    [(merge- n^ℓ1.P + P') | P'']^μ1 | [(merge+ n^ℓ2.Q + Q') | Q'']^μ2  →  [P | P'' | Q | Q'']^μ2 

Communication between ambients: 

    (n!{m}^ℓ1.P + P') | (n?{p}^ℓ2.Q + Q')  →  P | Q[m/p] 

    (n_!{m}^ℓ1.P + P') | [(n^?{p}^ℓ2.Q + Q') | Q'']^μ  →  P | [Q[m/p] | Q'']^μ 

    [(n^!{m}^ℓ1.P + P') | P'']^μ | (n_?{p}^ℓ2.Q + Q')  →  [P | P'']^μ | Q[m/p] 

    [(n#!{m}^ℓ1.P + P') | P'']^μ1 | [(n#?{p}^ℓ2.Q + Q') | Q'']^μ2  →  [P | P'']^μ1 | [Q[m/p] | Q'']^μ2 

Execution in context: 

        P → Q              P → Q               P → Q           P ≡ P'   P' → Q'   Q' ≡ Q 
    ---------------    ---------------    ---------------    ---------------------------- 
     (n)P → (n)Q        [P]^μ → [Q]^μ      P | R → Q | R                P → Q 



do not modify the overall ambient structure; however, some of the processes are 
modified in order to reflect the new binding of names. The semantics of recursion 
amounts to a straightforward unfolding in the congruence relation; this is more 
general than the overly restrictive semantics used in [9]. 

Example 2. The semantics of the program P_virus of Example 1 is illustrated in 
Figure 1. The initial configuration is shown in the upper leftmost frame where 
the tree structure reflects that cell and virus are siblings (with a common father 
denoted ⊤) and gene is a subambient of virus. The first step of the semantics will 
be for virus to move into cell using the pair (enter accept n^®) of capabilities 
and we obtain the configuration depicted in the bottom leftmost frame of the 
figure. Now there are two possibilities: either virus moves out of cell using the pair 
(exit 77,2^, expel n^) of capabilities and we are back in the initial configuration 
at the top or, alternatively, there is a communication from cell to virus over 
the name c using the pair (c-ljna}^®, c~?{a;}^®) of capabilities during which x 
is bound to as indicated in the corresponding frame of the figure. The pair 
(exit rig®, expel ng'*) of capabilities will now move gene out of virus and we reach 
a configuration where virus can exit and enter cell any number of times or the 
communication over c may happen again after which the system ends in a stuck 
configuration (shown in the top rightmost frame of the figure) . □ 

3 Compatibility Analysis 

The aim of the spatial analysis is to extract an over-approximation of the 
possible hierarchical structures of the ambients. For this we need to approximate 
the potential interactions between the ambients and motivated by [4] we shall 

Fig. 1. Illustration of the semantics of the running example. 



develop a compatibility analysis. Given a process P, the aim of the compatibility 
analysis is to identify pairs of labelled capabilities that, from a syntactic point of 
view, may engage in a transition. Intuitively, this means that the two capabilities 
must match and that it must be possible for them to occur in parallel processes. 
As an example, in [enter \ [accept the capabilities labelled £i and £2 

may interact because from a syntactic point of view we cannot preclude that n 
and m may turn out to be equal; however, if we replace the parallel composition 
with a non-deterministic choice then they will never be able to interact. 

The matching condition will ignore the actual names occurring in the 
capabilities (because even the canonical names are not preserved under reduction) 
and to formalise this we shall introduce the notion of a skeleton capability: ⌈M⌉ 
is simply obtained from M by replacing all names in M with the token “·”. 
The matching condition on skeleton capabilities can now be expressed by the 
predicate 

    match(⌈M1⌉, ⌈M2⌉) 

that holds if and only if 

    (⌈M1⌉, ⌈M2⌉) ∈ { (enter ·, accept ·), (exit ·, expel ·), (merge- ·, merge+ ·), 
                      (·?{·}, ·!{·}), (·^?{·}, ·_!{·}), (·_?{·}, ·^!{·}), (·#?{·}, ·#!{·}) } 
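
A direct way to read the predicate is as a lookup in a fixed table of ordered pairs. The Python sketch below is only an illustration under assumed encodings: a capability is a hypothetical record holding its kind (for example "enter", "^?" or "_!") together with the names it mentions, and taking the skeleton simply forgets the names.

```python
from typing import NamedTuple, Tuple


class Capability(NamedTuple):
    kind: str                  # e.g. "enter", "accept", "?", "_!", "#?"
    names: Tuple[str, ...]     # the names mentioned by the capability


def skeleton(cap: Capability) -> str:
    """Replace every name by the token '.', i.e. keep only the kind."""
    return cap.kind


# Ordered pairs of skeleton kinds, listed as in the predicate above.
MATCHING_PAIRS = {
    ("enter", "accept"),
    ("exit", "expel"),
    ("merge-", "merge+"),
    ("?", "!"),
    ("^?", "_!"),
    ("_?", "^!"),
    ("#?", "#!"),
}


def match(skeleton1: str, skeleton2: str) -> bool:
    return (skeleton1, skeleton2) in MATCHING_PAIRS
```

Because names are discarded, `enter n` and `accept m` match even when n and m differ, which is the intended over-approximation.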

In order to define the compatibility information we shall first need to extract 
the set of labelled skeleton capabilities occurring within a process. This is done 
using the function caps_Γ(P) of Table 5; here Γ is a mapping that to each process 
identifier associates a set of labelled skeleton capabilities. The mapping Γ is 
useful later (in the definition of comp) when we encounter subprocesses with 
free process identifiers. 

The compatibility information is then obtained using the function comp_ΓΔ(P) 
of Table 5. Here Γ is as above whereas Δ is a mapping that to each process 

Table 5. Capabilities, caps_Γ(P), and compatible pairs of capabilities, comp_ΓΔ(P). 

    caps_Γ(0)        = ∅ 
    caps_Γ((n)P)     = caps_Γ(P) 
    caps_Γ([P]^μ)    = caps_Γ(P) 
    caps_Γ(M^ℓ.P)    = {⌈M⌉^ℓ} ∪ caps_Γ(P) 
    caps_Γ(P | P')   = caps_Γ(P) ∪ caps_Γ(P') 
    caps_Γ(P + P')   = caps_Γ(P) ∪ caps_Γ(P') 
    caps_Γ(rec X.P)  = caps_Γ[⌊X⌋ ↦ ∅](P) 
    caps_Γ(X)        = Γ(⌊X⌋) 

    comp_ΓΔ(0)       = ∅ 
    comp_ΓΔ((n)P)    = comp_ΓΔ(P) 
    comp_ΓΔ([P]^μ)   = comp_ΓΔ(P) 
    comp_ΓΔ(M^ℓ.P)   = comp_ΓΔ(P) 
    comp_ΓΔ(P | P')  = comp_ΓΔ(P) ∪ comp_ΓΔ(P') ∪ cross_Γ(P, P') 
    comp_ΓΔ(P + P')  = comp_ΓΔ(P) ∪ comp_ΓΔ(P') 
    comp_ΓΔ(rec X.P) = comp_Γ'Δ[⌊X⌋ ↦ ∅](P)   where Γ' = Γ[⌊X⌋ ↦ caps_Γ[⌊X⌋ ↦ ∅](P)] 
    comp_ΓΔ(X)       = Δ(⌊X⌋) 

    cross_Γ(P, P')   = { (⌈M1⌉^ℓ1, ⌈M2⌉^ℓ2) ∈ (caps_Γ(P) × caps_Γ(P')) ∪ (caps_Γ(P') × caps_Γ(P)) 
                         | match(⌈M1⌉, ⌈M2⌉) } 



identifier associates a set of pairs of labelled skeleton capabilities; again we are 
parametric on Γ and Δ so that we can handle processes with free process 
identifiers. In the case of parallel composition the definition of comp uses the auxiliary 
operation cross to record that capabilities in the two branches may interact with 
one another and the caps function is used in order to specify this. This is in 
contrast to the definition provided for non-deterministic choice where it is known 
that capabilities from the two branches never will interact. 
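
The two functions can be sketched as recursive traversals of a process term. The following Python fragment is a minimal illustration, assuming one particular reading of Table 5 and a hypothetical tuple encoding of processes (`("nil",)`, `("new", n, P)`, `("amb", mu, P)`, `("pre", kind, label, P)`, `("par", P, Q)`, `("sum", P, Q)`, `("rec", X, P)`, `("var", X)`); a labelled skeleton capability is a pair (kind, label).

```python
# Ordered pairs of skeleton kinds that match (see the match predicate).
MATCH = {("enter", "accept"), ("exit", "expel"), ("merge-", "merge+"),
         ("?", "!"), ("^?", "_!"), ("_?", "^!"), ("#?", "#!")}


def caps(p, gamma):
    """Labelled skeleton capabilities of p; gamma maps identifiers to sets."""
    tag = p[0]
    if tag == "nil":
        return frozenset()
    if tag in ("new", "amb"):
        return caps(p[2], gamma)
    if tag == "pre":
        _, kind, label, cont = p
        return frozenset({(kind, label)}) | caps(cont, gamma)
    if tag in ("par", "sum"):
        return caps(p[1], gamma) | caps(p[2], gamma)
    if tag == "rec":
        return caps(p[2], {**gamma, p[1]: frozenset()})
    if tag == "var":
        return gamma[p[1]]
    raise ValueError(f"unknown process form: {tag}")


def cross(p, q, gamma):
    """Matching pairs with one capability from each parallel branch."""
    cp, cq = caps(p, gamma), caps(q, gamma)
    pairs = {(a, b) for a in cp for b in cq} | {(b, a) for a in cp for b in cq}
    return frozenset((a, b) for a, b in pairs if (a[0], b[0]) in MATCH)


def comp(p, gamma, delta):
    """Compatible pairs of p; delta maps identifiers to sets of pairs."""
    tag = p[0]
    if tag == "nil":
        return frozenset()
    if tag in ("new", "amb"):
        return comp(p[2], gamma, delta)
    if tag == "pre":
        return comp(p[3], gamma, delta)
    if tag == "par":   # the two branches may interact
        return comp(p[1], gamma, delta) | comp(p[2], gamma, delta) | cross(p[1], p[2], gamma)
    if tag == "sum":   # the two branches never interact
        return comp(p[1], gamma, delta) | comp(p[2], gamma, delta)
    if tag == "rec":
        x, body = p[1], p[2]
        gamma2 = {**gamma, x: caps(body, {**gamma, x: frozenset()})}
        return comp(body, gamma2, {**delta, x: frozenset()})
    if tag == "var":
        return delta[p[1]]
    raise ValueError(f"unknown process form: {tag}")


enter = ("amb", "mu1", ("pre", "enter", "l1", ("nil",)))
accept = ("amb", "mu2", ("pre", "accept", "l2", ("nil",)))
```

With these two ambients, `comp(("par", enter, accept), {}, {})` contains the single pair of the capabilities labelled `l1` and `l2`, whereas replacing `"par"` by `"sum"` yields the empty set.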

Example 3. For the running example Pvirus of Examples 1 and 2 we get: 

CPvirus = { (enter accept -^®), 

(exit expel (exit expel (exit expel 

(._!{.}^a,.7{.}P)} 

Comparing with Figure 1 we see that this is indeed an over-approximation of 
the actual interactions that can take place: the pair (exit expel G CPvirus 
has no analogue in Figure 1. □ 

Example 4. Consider the artificial variant P'_virus of the process of Example 1 

where the virus exists in two variants, one with a gene much as before and one 
with a harmless gene: 

[rec Ai. enter X + exit X + c'?{a:}^^. expel X 
I ( [exit 0 -f accept -b [enter ni^ 0 

I [recF. accept Y + expel . Y + c_!{ri 3 }^'^“. Y 

The compatibility analysis will compute the following information: 

CPvirus = { (enter accept -^®), (enter -^E accept -^®), (enter accept -^®), 
(exit --^E expel -^s), (exit -^E expel (exit -^E expel 

(..!{.}Eo,.~7{.}^3)} 

Note that despite the over-approximation this correctly captures that, for 
example, the capabilities labelled ℓ7 and ℓ5 of the two genes never will be able to 
interact. □ 

The correctness of the compatibility analysis follows from: 

Lemma 1. If P ≡ Q and C ⊢ P then comp_ΓΔ(P) = comp_ΓΔ(Q). If P → Q 
and C ⊢ P then comp_ΓΔ(Q) ⊆ comp_ΓΔ(P). 

In the subsequent analyses we shall make use of the compatibility relation 
for the overall program P* of interest. Writing [] for the empty mapping we 
shall use the abbreviation CP* for comp_[][](P*), thereby exploiting that P* has 
no free process identifiers. Thus it follows from Lemma 1 that if PRG_C(P*) and 
P* →* Q then comp_[][](Q) ⊆ CP*, so CP* remains a correct over-approximation. 

4 Spatial Analysis 

We are now ready to embark on the spatial analysis: for a program P* we want 
to approximate what ambients may turn up inside what other ambients. To 
extract this information we shall develop an analysis extracting the following 
information: 

— An approximation of the contents of ambients: 

    I ⊆ Ambient × (Ambient ∪ (Cap × Lab)) 

Here μ' ∈ I(μ) means that μ' may be a subambient of the ambient μ and 
⌊M⌋^ℓ ∈ I(μ) means that the labelled canonical capability ⌊M⌋^ℓ may be 
within the ambient μ. 

— An approximation of the relevant name bindings: 

    R ⊆ V × C (⊆ Name × Name) 

Here v' ∈ R(v) means that the constant (canonical) name v' may be bound 
to the variable (canonical) name v. 

The judgements of the analysis take the form 

    (I, R) ⊨^μ P 

and express that when the subprocess P (of P*) is enclosed within an ambient 
with the identity μ ∈ Ambient then I and R correctly capture the behaviour of 
P, meaning that I will reflect the contents of the ambients as P evolves inside 
P* and R will contain all the bindings of names that take place. The analysis 
is specified in Table 6 and refers to Table 7 for auxiliary information about the 
recursion construct and to Table 8 for a specification of the closure conditions 
closure_⌈M⌉. Below we comment on the clauses. 

Table 6. Analysis specification: (I, R) ⊨^μ P. 

    (I, R) ⊨^μ 0          iff   true 
    (I, R) ⊨^μ (n)P       iff   (I, R) ⊨^μ P 
    (I, R) ⊨^μ [P]^μ'     iff   μ' ∈ I(μ) ∧ (I, R) ⊨^μ' P 
    (I, R) ⊨^μ M^ℓ.P      iff   ⌊M⌋^ℓ ∈ I(μ) ∧ (I, R) ⊨^μ P ∧ closure_⌈M⌉ 
    (I, R) ⊨^μ P | P'     iff   (I, R) ⊨^μ P ∧ (I, R) ⊨^μ P' 
    (I, R) ⊨^μ P + P'     iff   (I, R) ⊨^μ P ∧ (I, R) ⊨^μ P' 
    (I, R) ⊨^μ rec X.P    iff   ∀μ' : μ' ∈ L_⌊X⌋(G^μ(rec X.P)) ⇒ (I, R) ⊨^μ' P 
    (I, R) ⊨^μ X          iff   true 



Table 7. Auxiliary analysis information for recursion: G^δ(P). 

    G^δ(0)         = ∅ 
    G^δ((n)P)      = G^δ(P) 
    G^δ([P]^μ)     = G^μ(P) 
    G^δ(M^ℓ.P)     = G^δ(P) 
    G^δ(P | P')    = G^δ(P) ∪ G^δ(P') 
    G^δ(P + P')    = G^δ(P) ∪ G^δ(P') 
    G^δ(rec X.P)   = G^⌊X⌋(P) ∪ {⌊X⌋ → δ} 
    G^δ(X)         = {⌊X⌋ → δ} 



Table 6 specifies a simple syntax directed traversal of the process with the
clauses for ambients and capabilities being two of the more interesting ones as
they check that $\mathcal{I}$ contains the correct initial information. The clause for $(n)P$ is
very simple since $n$ is a constant (in contrast to a variable); in particular there is
no need to impose any requirements on $\mathcal{R}$. The clauses for the parallel and the
choice constructs look exactly the same; however, the use of the compatibility
information in the closure conditions of Table 8 ensures that they are indeed
handled differently.

The clause for recursion ensures that the analysis result is valid in all the
contexts in which the recursion construct $\mathsf{rec}\ X.\ P$ may be encountered, including
those arising from its unfolding. These contexts are provided by the auxiliary
operation $G^{\mu}(P)$ (see Table 7) that constructs a simple regular grammar for the
potential contexts of the process identifiers. The non-terminals of the grammar are
the canonical process identifiers, the terminal symbols are the ambient identities,
and the right-hand side of each production contains exactly one (non-terminal
or terminal) symbol. The language generated by the grammar $G^{\mu}(\mathsf{rec}\ X.\ P)$ when
$\lfloor X \rfloor$ is the start symbol is written $\mathcal{L}_{\lfloor X \rfloor}(G^{\mu}(\mathsf{rec}\ X.\ P))$ and it approximates
the contexts in which the recursion construct may be encountered. This language
is clearly finite. As an example, for the process $[\mathsf{rec}\ X.\ \mathsf{rec}\ Y.\ (X \mid [Y]^{\mu_2})]^{\mu_1}$
we obtain a grammar with the productions $\{\lfloor X \rfloor \to \mu_1,\ \lfloor Y \rfloor \to \lfloor X \rfloor,\ \lfloor X \rfloor \to \lfloor Y \rfloor,\ \lfloor Y \rfloor \to \mu_2\}$.
The language generated by this grammar from the non-terminal
$\lfloor X \rfloor$ is $\{\mu_1, \mu_2\}$, reflecting that the outermost recursion may occur in both
contexts, as can be seen by unfolding both $X$ and $Y$ once.
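The construction of Table 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mini-AST (`Nil`, `Var`, `Rec`, `Amb`, `Par`) and the function names `grammar` and `language` are hypothetical and not part of the paper or of the Succinct Solver implementation, and process identifiers and ambient identities are assumed to be disjoint strings.

```python
from dataclasses import dataclass

# Hypothetical mini-AST: only the constructs that affect G are modelled.
# Restriction (n)P and capability prefixes M.P pass the context through
# unchanged, and P + P' is treated exactly like P | P', so they are omitted.

@dataclass(frozen=True)
class Nil:
    pass

@dataclass(frozen=True)
class Var:          # process identifier X
    name: str

@dataclass(frozen=True)
class Rec:          # rec X. P
    name: str
    body: object

@dataclass(frozen=True)
class Amb:          # [P]^mu
    mu: str
    body: object

@dataclass(frozen=True)
class Par:          # P | P'
    left: object
    right: object

def grammar(ctx, p):
    """Productions of G^ctx(p) as a set of (non-terminal, symbol) pairs."""
    if isinstance(p, Nil):
        return set()
    if isinstance(p, Var):
        return {(p.name, ctx)}
    if isinstance(p, Rec):
        # The body is analysed with the identifier itself as context.
        return grammar(p.name, p.body) | {(p.name, ctx)}
    if isinstance(p, Amb):
        return grammar(p.mu, p.body)
    if isinstance(p, Par):
        return grammar(ctx, p.left) | grammar(ctx, p.right)
    raise TypeError(f"unsupported process: {p!r}")

def language(start, prods):
    """Terminals reachable from `start`; each right-hand side is one symbol."""
    nonterminals = {lhs for lhs, _ in prods}
    seen, todo, terminals = set(), [start], set()
    while todo:
        nt = todo.pop()
        if nt in seen:
            continue
        seen.add(nt)
        for lhs, rhs in prods:
            if lhs != nt:
                continue
            if rhs in nonterminals:
                todo.append(rhs)
            else:
                terminals.add(rhs)
    return terminals

# The example process from the text, placed in an artificial context "top".
proc = Amb("mu1", Rec("X", Rec("Y", Par(Var("X"), Amb("mu2", Var("Y"))))))
prods = grammar("top", proc)
print(sorted(language("X", prods)))   # ['mu1', 'mu2']
```

Because every production has a single symbol on its right-hand side, the language is just the set of terminals reachable from the start symbol, which is why it is finite.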

Turning to the closure conditions of Table 8 we first observe that there are 
two clauses for each matching pair of skeleton capabilities and one of these is 
trivial. In each case the pre-condition of the non-trivial clause checks whether 
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Table 8. Closure condition on X and 77.. 



closu reenter 


= V/r, /ri, /T 2 , r' 2 , ti, £2 : 

enter G X{/ii) A fii & T(/r) A accept G X(fJ-2) A /i2 6 X{/i)A 

{TZ){i'i) n {TZ){u2) 0 a CP*(enter , accept 

fii e T(^ 2 ) 


CIOSU reaccept ■ 


= true 


closu reexit ■ 


= y fl, fj. 1 , ^2,121, '■ 

exit G X{^i) A G I{fJ.2) A expel G I{^J.2) A /i2 G T(/r)A 

{TZ){ui) n {TZ){u2) / 0 a CP*(exit expel 

fj,i e i{n) 


closu reexpel . 


= true 


closu remerge- ■ 


= fl, fll, ^2,121, 1^2, il, (-2 ■ 

merge- G X(/ii) A G X(/i) A merge-|- i>2^ G T(/i 2 ) A fj.2 E X{p)A 

{TZ){ui) n {R){u2) / 0 a CP*(merge- merge+ 

=> X(^i)CX(^ 2 ) 


c:losure„,^,g^+ . 


= true 


closure. 


= 1^1, 1^2, il,h ■■ 

Ui\{umY^ AU2?{upY'^ GX(/r)A 

{TZ){u,) n {TZ)(u2) / 0 A CP*(-!{.}‘^ , 
=> ( 77 )(i/m) C 77 (t'p) 


closure. ?{.} 


= true 


closure. 


= yfJ,,fJ,c,l 2 m,l'p,l 2 l,l' 2 ,il ,£2 ■ 

e 2 {fJ.) A U 2 ''?{upY^ G X(/Tc) A fj.c E T(^)A 

{'R){u,) n {n){u2) / 0 A CP*(-J{-}'b ■"?{■}'=) 

=> ( 77 )(i/m) C 77 (t'p) 


closure. -7{.} 


= true 


closure. -]{.} 


= ^ll, flc, 12 m, I^P,U 1 , 122, il, £2 ■■ 

12 l''\{l 2 mY^ E X(/Tc) A flc E X{n) A 122 -T{l 2 pY^ E X{fJ.)A 

{K){v,) n {n){u2) / 0 A CP*(--!{.}'b ■-?{■}'=) 

^ {'R-){Vm) ETZ{l 2 p) 


closure. _?{.). 


= true 


closure j 


= V/i, fJ.1 , ^2 , 12 m , I 2 p , 12 l , 122 , £1 , £2 '■ 

i 2 i#Yi^rnY^ £ T(/ri) A E 1(12) A i 22 #T{i 2 pY^ E 1(122) A ^22 E X(/r)A 

( 77 )(i 2 i) n {n){u2) / 0 A CP*(-#!{-}'^ , -#H-YY 

=> (77) (m) C TZ(i2p) 


closure j 


= true 



an abstract version of the firing conditions of the corresponding transition rule
is fulfilled, and the conclusion then records an abstract version of the resulting
configuration. The $\mathcal{I}$ relation is used to check the spatial conditions, the $\mathcal{R}$
relation is used to check the potential agreement of names, and the compatibility
information of $CP_\star$ is used to check whether the current pairs of canonical
capabilities may interact at all. Since the relation $\mathcal{R}$ is only concerned with the
names that act as variables we shall use a slightly modified version of $\mathcal{R}$, namely

Table 9. Three BioAmbient example processes from [14].



Membranal pore : 

[recAi. enter cdll^ Ai + exit cell^^ . 

I [recX2. enter cell^^ . X2 + exit celf^- X2 
I [recXa. accept cell{^ . X3 + expel cell^ . X3 

Single substrate enzymatic reaction : 

[recX. accept esbindf'''^ . (expel unbind^^ . X + expel react^^ . X) 

+accept epbind^*. (expel unbind^^ . X + expel react^^ . 

I [recXi. enter esbind^’' . recX2. (exit unbind^^ . Xi 

+exit react^^. enter epbind^^° . X2)]™°* 



Two-protein complex : 

[recXi. merge- cplx^''^ . brkl{uY^- expel brk^^ . Xi]’^°‘^ 

I [(bfe)recX2. merge+ cplx^'^. hrk\{dY^ . bb\{d}^^ . [merge+ 66^’’. exit brk^^ . X2]™'°*'^ 
I recXg. bb?{vY^. [merge- bb‘^« . X3]"*°'= 



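$\langle\mathcal{R}\rangle \subseteq (\mathbf{V} \cup \mathbf{C}) \times \mathbf{C}$

that takes care of variables as well as constants; it is defined by:

$(\forall n : n \in \mathbf{C} \Rightarrow \langle\mathcal{R}\rangle(n, n)) \wedge (\forall n, m : \mathcal{R}(n, m) \Rightarrow \langle\mathcal{R}\rangle(n, m))$

The analysis result for the program $P_\star$ is then the minimal $\mathcal{I}$ and $\mathcal{R}$ such that
$(\mathcal{I}, \mathcal{R}) \models^{\top} P_\star$ where $\top$ is the identity of an artificial top-level ambient.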
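To make the shape of a closure condition concrete, the following Python sketch computes $\langle\mathcal{R}\rangle$ and saturates $\mathcal{I}$ under the enter/accept clause only. It is a minimal illustration under assumptions of ours, not the Succinct Solver encoding used by the authors: ambient identities are strings, capabilities are hypothetical triples `(kind, name, label)`, `CP` is a set of compatible capability pairs, and the function names `angle` and `close_enter` are invented for this example.

```python
def angle(R, constants):
    """<R>: R extended so that every constant is related to itself."""
    ext = {n: set(vs) for n, vs in R.items()}
    for c in constants:
        ext.setdefault(c, set()).add(c)
    return ext

def close_enter(I, R_ext, CP):
    """Saturate I under the enter/accept clause until nothing changes.

    I maps an ambient identity to a set holding ambient identities (str)
    and capabilities (tuples); the dictionary is updated in place.
    """
    changed = True
    while changed:
        changed = False
        for mu in list(I):
            siblings = [a for a in I[mu] if isinstance(a, str)]
            for mu1 in siblings:
                for mu2 in siblings:
                    for c1 in list(I.get(mu1, ())):
                        for c2 in list(I.get(mu2, ())):
                            if not (isinstance(c1, tuple) and isinstance(c2, tuple)):
                                continue
                            if c1[0] != "enter" or c2[0] != "accept":
                                continue
                            # Names must potentially agree.
                            if not (R_ext.get(c1[1], set()) & R_ext.get(c2[1], set())):
                                continue
                            # Capabilities must be compatible.
                            if (c1, c2) in CP and mu1 not in I[mu2]:
                                I[mu2].add(mu1)
                                changed = True
    return I

enter = ("enter", "n", "l1")
accept = ("accept", "n", "l2")
I = {"top": {"cell", "virus"}, "virus": {enter}, "cell": {accept}}
R_ext = angle({}, {"n"})
close_enter(I, R_ext, {(enter, accept)})
print("virus" in I["cell"])   # True
```

The entry for `virus` in `I["top"]` is kept, since the analysis only ever adds information; this is what makes the result an over-approximation.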

Example 5. The analysis of the running example $P_{\mathit{virus}}$ gives rise to the following
minimal $\mathcal{I}$ and $\mathcal{R}$:





X{p) 




n 


TZ{n) 


cell 


gene, virus, expel accept 




X 


ns 


gene 


exit Ug® 






virus 


gene, expel , c'l {xY^ , ex\t n^^, enter 






T 


gene, cell, virus 







Figure 2 (a) gives a graphical representation of the ambient part of the relation
$\mathcal{I}$. There is one node for each of the ambient identities and an edge from the node
representing $\mu_1$ to the one representing $\mu_2$ if and only if $(\mu_1, \mu_2) \in \mathcal{I}$. The edge is
solid if $(\mu_1, \mu_2)$ is introduced into $\mathcal{I}$ by the initialisation rules of Table 6 and it is
dotted if it is introduced by the closure conditions of Table 8. Note that the trees
of the individual frames of Figure 1 are all subgraphs of this figure (as should
be expected from the semantic correctness result to be presented below). The
example also shows that the analysis is indeed an over-approximation: although
it is reported that the gene may occur at the top-level, this will never happen. □

Fig. 2. Spatial analysis of the running examples $P_{\mathit{virus}}$ (a) and $P'_{\mathit{virus}}$ ((b) and (c)).



Example 6. To illustrate the importance of the comp relation consider the
artificial variant of the virus process of Example 4. Figure 2 (b) gives a graphical
representation of the $\mathcal{I}$ component of the analysis result and, as expected, we
observe that the harmless gene does not change its position within the ambient
hierarchy.

If we were to remove the tests on the compatibility relation in the closure
conditions of Table 8 then we would obtain a more imprecise result, as illustrated
in Figure 2 (c): it now seems that one of the genes may move into the other.
The reason for this is, of course, that without the compatibility test the analysis
does not observe that the two genes will never be present at the same time. □

Turning to the correctness of the analysis we shall state that the analysis
result is invariant under the structural congruence:

Lemma 2. If $P \equiv Q$ and $C \vdash P$ then $(\mathcal{I}, \mathcal{R}) \models^{\mu} P$ if and only if $(\mathcal{I}, \mathcal{R}) \models^{\mu} Q$.

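To express the correctness of the analysis result under reduction we shall
first introduce a new operation that expands the $\mathcal{I}$ component of the analysis to
take the bindings of the variables into account as specified by the $\mathcal{R}$ component.
Thus if $\mathsf{enter}\ \nu^{\ell} \in \mathcal{I}(\mu)$ then $\nu$ may be the canonical name of a variable and we
shall construct the relation $\mathcal{I}@\mathcal{R}$ such that $\mathsf{enter}\ \nu'^{\ell} \in \mathcal{I}@\mathcal{R}(\mu)$ for all possible
constants $\nu'$ that can be bound to $\nu$, that is, for all $\nu' \in \langle\mathcal{R}\rangle(\nu)$. More generally,
we define $\mathcal{I}@\mathcal{R}$ as follows:

If $\lfloor M \rfloor^{\ell} \in \mathcal{I}(\mu)$, $\nu \in \mathrm{fn}(\lfloor M \rfloor)$ and $\nu' \in \langle\mathcal{R}\rangle(\nu)$ then $\lfloor M \rfloor^{\ell}[\nu'/\nu] \in \mathcal{I}@\mathcal{R}(\mu)$.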
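One way to read this definition operationally is sketched below in Python. The encoding is an assumption made for illustration: capabilities are hypothetical triples `(kind, names, label)` with `names` a tuple of canonical names, all names of a capability are replaced simultaneously by constants drawn from $\langle\mathcal{R}\rangle$, and ambient entries are copied unchanged. The function name `expand` is ours.

```python
from itertools import product

def expand(I, R_ext):
    """Sketch of I@R: replace each name in a capability by its possible constants."""
    out = {}
    for mu, entries in I.items():
        acc = out.setdefault(mu, set())
        for entry in entries:
            if isinstance(entry, str):
                # Ambient identity: copied unchanged (an assumption of this sketch).
                acc.add(entry)
                continue
            kind, names, label = entry
            choices = [sorted(R_ext.get(n, ())) for n in names]
            # One expanded capability per combination of bound constants.
            for picked in product(*choices):
                acc.add((kind, picked, label))
    return out

# x is a variable that may be bound to the constants a or b;
# constants are related to themselves in <R>.
R_ext = {"x": {"a", "b"}, "a": {"a"}, "b": {"b"}}
I = {"cell": {"gene", ("enter", ("x",), "l1")}}
once = expand(I, R_ext)
print(sorted(once["cell"], key=str))
```

Under this reading a second expansion changes nothing, because every name left in a capability is a constant that $\langle\mathcal{R}\rangle$ relates only to itself.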

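We can now express that the analysis result is preserved under reduction in
the following sense:

Lemma 3. Assume $\mathrm{PRG}_C(P)$ and $\mathsf{comp}(P) \subseteq CP_\star$; if furthermore $P \to Q$
and $(\mathcal{I}, \mathcal{R}) \models^{\mu} P$ then $(\mathcal{I}@\mathcal{R}, \mathcal{R}) \models^{\mu} Q$.

It is immediate to show that $\mathcal{I}@\mathcal{R} = (\mathcal{I}@\mathcal{R})@\mathcal{R}$ and hence we can state the
overall correctness result as follows:

Theorem 1. If $\mathrm{PRG}_C(P_\star)$, $(\mathcal{I}, \mathcal{R}) \models^{\top} P_\star$ and $P_\star \to^{*} Q$ then $(\mathcal{I}@\mathcal{R}, \mathcal{R}) \models^{\top} Q$.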

5 Concluding Remarks 

We have presented a spatial analysis for a version of BioAmbients with a general 
recursion construct that allows us to express mutual recursion as seems to be 
required in order to model biological systems. The analysis has been implemented 
using the Succinct Solver [10] and has subsequently been applied to a number 
of examples including three small examples from [13, 14] presented below. We 
conclude with a comparison with related work - indicating those techniques that 
are new to this paper. 

Three Examples. The first example of Table 9 is a membranal pore allowing 
molecules to pass through a membrane. The example is specialised to the case 
of a single cell and two molecules; when executed, the two molecules may
enter and leave the cell any number of times and independently of one another. 
This is clearly captured by the analysis result of Figure 3. Also the analysis tells 
us that the cell will never enter one of the molecules and that the molecules will 
never enter one another; while this may be easy to see for a small example it 
may not be so obvious for a larger system. 

The second example of Table 9 models a single-substrate enzymatic process 
and compared with the previous example its control structure is more complex in 
that it uses a double recursion and a number of names to control the interaction 
between the ambients. The analysis result depicted in Figure 3 exhibits the 
underlying spatial structure. 

The final example of Table 9 models the formation and breakage of a two- 
protein complex. Initially the system consists of two molecules and the complex 
is formed by the merge operation. The breakage is initiated by a communication 
followed by a communication over a private name and finally the complex is sep- 
arated into two molecules with the same structure as in the initial configuration. 
The rather complex control structure is reflected in the analysis result presented 
in Figure 3 showing that both molecules can be inside one another and that they 
both have the ability to reconstruct themselves. 

Comparison with Related Work. The work presented in this paper is one of the 
first static analyses of calculi for modelling biological systems; to the best of 
our knowledge, the only preceding work is that of [9], and the present work
comprises a number of improvements and novelties. 

One important difference is the way names are handled. In [9] we follow the
traditional approach of control flow analysis and use an environment $\mathcal{R}$ that
corresponds more closely to the auxiliary environment $\langle\mathcal{R}\rangle$ used here. Hence,
in [9] we make an entry into $\mathcal{R}$ whenever a name is introduced (and in the
case of a constant it is mapped to itself) and when we make an entry with a

Fig. 3. Spatial analysis of examples in Table 9 (panels: membranal pore, enzyme-substrate complex, two-protein complex).



free name into $\mathcal{I}$ we make sure to make entries corresponding to all bindings
of the free name as recorded in the environment (i.e. $\mathcal{R}$). While this leads to
a rather natural formulation of the clauses and straightforward formulations of
the semantic correctness result, the relations become overly large. Hence in the
interest of obtaining more manageable implementations we have chosen not to
add constants into environments and only to make representative entries into $\mathcal{I}$
that are then expanded "on the fly" during look-up. Essentially we are trading
space for time which generally is a good strategy when using the Succinct Solver.
To formulate the semantic correctness of the analysis we therefore need to make
a similar expansion and this is achieved using $\mathcal{I}@\mathcal{R}$.

Another important difference is our treatment of recursion which is techni- 
cally much more complex than the traditional treatment of replication (as in 
!P). The treatment of recursion in [9] was unsatisfactory in that the unfolding 
of the recursion construct was part of the transition relation rather than the 
congruence as in the present paper, and hence [9] misses some of the interac- 
tions correctly captured here. (To the best of our knowledge the analysis in [9] 
is correct with respect to the semantics.) For a correct treatment of this general 
way of unfolding recursion we have had to ensure that the body of the recursion 
is analysed in all contexts that may arise dynamically. While this may sound 
like just another component that could be added to the analysis (e.g. tracking
occurrences of process identifiers in $\mathcal{I}$) it actually turns out to be important
not to include this information in the analysis in order for the analysis to be
semantically correct. Hence we have defined an operation for constructing a
simple regular grammar deriving the possible contexts; it is essential for semantic
correctness of the analysis that this information is not stored in components
like $\mathcal{I}$ and $\mathcal{R}$ but rather computed "on the fly". This technique is likely to be
useful for other calculi also outside the realm of biological systems.

Acknowledgements 

The authors would like to thank Corrado Priami and Debora Schuch da Rosa 
for fruitful discussions. 






References 

1. M. Bugliesi, G. Castagna, and S. Crafa. Boxed Ambients. In Theoretical Aspects
of Computer Software (TACS 2001), volume 2215 of Lecture Notes in Computer
Science, pages 37-63. Springer, 2001.

2. L. Cardelli. Brane calculi. 2003. Available from http://www.luca.demon.co.uk. 

3. L. Cardelli and A. D. Gordon. Mobile Ambients. In Foundations of Software Sci- 
ence and Computation Structures (FoSSaCS 1998), volume 1378 of Lecture Notes 
in Computer Science, pages 140-155. Springer, 1998. 

4. C. Bodei, P. Degano, C. Priami, and N. Zannone. An enhanced CFA for security
policies. In Proceedings of the Workshop on Issues in the Theory of Security (WITS'03)
(co-located with ETAPS'03), 2003.

5. V. Danos and C. Laneve. Core formal molecular biology. In European Symposium 
on Programming (ESOP03), volume 2618. Springer Lecture Notes in Computer 
Science, 2004. 

6. F. Levi and D. Sangiorgi. Controlling interference in ambients. In Proceedings of 
the 27th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Lan- 
guages (POPL 2000), pages 352-364. ACM Press, 2000. 

7. R. Milner. Communicating and Mobile Systems: The pi-Calculus. Cambridge Uni- 
versity Press, 1999. 

8. M. Nagasaki, S. Onami, S. Miyano, and H. Kitano. Bio-calculus: Its concept and
molecular interaction. Genome Informatics, 10:133-143, 1999.

9. F. Nielson, H. Riis Nielson, C. Priami, and D. Schuch da Rosa. Control Flow Anal- 
ysis for BioAmbients. Proceedings of BioConcur, to appear in ENTCS, 2004. 

10. F. Nielson, H. Riis Nielson, and H. Seidl. A succinct solver for ALFP. Nordic 
Journal of Computing, 9:335-372, 2002. 

11. Hanne Riis Nielson, Flemming Nielson, and Mikael Buchholtz. Security for Mo- 
bility. In Foundations of Security Analysis and Design II, volume 2946. Springer 
Lecture Notes in Computer Science, 2004. 

12. C. Priami, A. Regev, W. Silverman, and E. Shapiro. Application of a stochastic 
passing-name calculus to representation and simulation of molecular processes. 
Information Processing Letters, 80:25-31, 2001. 

13. A. Regev. Computational system biology: A calculus for biomolecular knowledge. 
PhD thesis, Tel Aviv University, 2003. 

14. A. Regev, E. M. Panina, W. Silverman, L. Cardelli, and E. Shapiro. BioAmbi- 
ents: An abstraction for biological compartments. Theoretical Computer Science, 
to appear, 2004. 

15. A. Regev, W. Silverman, and E. Shapiro. Representation and simulation of
biochemical processes using the π-calculus process algebra. In Pacific Symposium on
Biocomputing (PSB2001), pages 459-470, 2001.



Modular and Constraint-Based Information 
Flow Inference for an Object-Oriented Language 



Qi Sun^1, Anindya Banerjee^2, and David A. Naumann^1



^1 Stevens Institute of Technology, USA
{sunq,naumann}@cs.stevens-tech.edu
^2 Kansas State University, USA
ab@cis.ksu.edu



Abstract. This paper addresses the problem of checking programs writ- 
ten in an object-oriented language to ensure that they satisfy the in- 
formation flow policies, confidentiality and integrity. Policy is specified 
using security types. An algorithm that infers such security types in a 
modular manner is presented. The specification of the algorithm involves 
inference for libraries. Library classes and methods may be parameterized
by security levels. It is shown how modular inference is achieved in the 
presence of method inheritance and override. Soundness and complete- 
ness theorems for the inference algorithm are given. 



1 Introduction 

This paper addresses the problem of checking programs to ensure that they sat- 
isfy the information flow policies, confidentiality and integrity. Confidentiality, 
for example, is an important requirement in several security applications - by 
itself, or as a component of other security policies (e.g., authentication), or as 
a desirable property to enforce in security protocols [1] . In the last decade, im- 
pressive advances have been made in specifying static analyses for confidentiality 
for a variety of languages [14]. Information flow policy is expressed by labeling 
of input and output channels with levels, e.g., low, or public, (L) and high, or 
secret, (H) in a security lattice ($L \le H$). Many of these analyses are given in
the style of a security type system that is shown to enforce a noninterference
property [6]: a well-typed program does not leak secrets.

Previous work of Banerjee and Naumann provides a security type system
and noninterference result for a class-based object-oriented language with
features including method inheritance/overriding, dynamic binding, dynamically
allocated mutable objects, type casts and recursive types [3]. It is shown how
several object-oriented features can be exploited as covert channels to leak
secrets. Type checking in Banerjee and Naumann's security type system requires
manually annotating all fields, method parameters and method signatures with
security types.

The primary focus of this paper is the automatic inference of security type
annotations of well-typed programs. In this paper, we are not interested in full
type inference, and assume that a well-typed program is given. There are several
issues to confront. First, we demand inference of some, possibly all, security 
levels of fields in a class. This means that security types of fields will involve 
level variables and the same is true for method types where level variables will 
appear in types of method parameters and in the result type. 

The second issue, a critical challenge for scalability, is achieving modular 
security type inference for class-based languages. A non-modular, whole-program 
inference, say, for the language in [3], would perform inference in the context of 
the entire class table; if method m in class A is called in the body of method n 
declared in class B, then the analysis of B.n would also involve the analysis of 
A.m. Moreover, every use of A.m in a method body would necessitate its analysis. 
Our insistence on modular inference led us to the following choices: code is split 
into library classes (for which inference has already been performed) and the 
current analysis unit (for which inference is currently taking place). Inference 
naturally produces polymorphic types; so it seemed appropriate to go beyond 
previous work [3] and make libraries polymorphic. To track information flow, 
e.g., via field updates, constraints on level variables are imposed in the method 
signature; thus library method signatures appear as constrained polymorphic 
types. To avoid undecidability of inference due to polymorphic recursion [8,7],
mutually recursive classes and methods in the current analysis unit are analyzed
monomorphically^1.

Because we are analyzing an object-oriented language, the third issue we con- 
front is achieving modular inference in the presence of method inheritance and 
override. The current analysis unit can contain subclasses of a library class with 
some library methods overridden or inherited. To achieve modularity, we require 
that the signature of a library method is invariant with respect to subclassing. 
Getting the technical details correct is a formidable challenge and one which we 
have met. We provide some intuition on the problem presently. 

The research reported in this paper is being carried out in the context of a 
tool, currently under development, that handles the above issues. The tool helps 
the programmer design a library interactively by inferring the signatures of new 
classes, together with a constraint set showing the constraints that level variables 
in the signature must obey. The security types of the new classes are inferred in 
the context of the existing library. The new code may inherit library methods 
- this causes the polymorphic signature of a library method to be instantiated 
at every use of the method, and the instantiated constraints will apply to the 
current context. 

Handling method override is more subtle; modularity requires that the poly- 
morphic type inferred for a library method must be satisfied by all its overriding 
methods. For an overriding method in a subclass, if the inference algorithm 
generates constraints that are not implied by the constraints of the superclass 
method, then the unit must be rejected. 
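The override check described here amounts to asking whether one constraint set implies another. Over the two-point lattice L ≤ H this can be decided by brute-force enumeration, as in the Python sketch below. The representation is an assumption made for illustration (a constraint is a pair `(a, b)` meaning a ≤ b, with level variables as arbitrary strings other than `"L"` and `"H"`); the function names are ours, and the inference algorithm of section 4 is not claimed to work this way.

```python
from itertools import product

LEVELS = {"L": 0, "H": 1}   # the two-point security lattice L <= H

def holds(constraints, env):
    """True if every constraint a <= b is satisfied under the assignment env."""
    def value(term):
        # Constants evaluate to themselves; variables are looked up in env.
        return LEVELS[env.get(term, term)]
    return all(value(a) <= value(b) for a, b in constraints)

def variables(*constraint_sets):
    """All level variables mentioned in the given constraint sets."""
    return sorted({t for cs in constraint_sets for c in cs for t in c
                   if t not in LEVELS})

def implies(k_super, k_sub):
    """True if every ground solution of k_super also satisfies k_sub."""
    names = variables(k_super, k_sub)
    for levels in product(LEVELS, repeat=len(names)):
        env = dict(zip(names, levels))
        if holds(k_super, env) and not holds(k_sub, env):
            return False
    return True

# Hypothetical signatures: the overriding method needs a1 <= a5, which the
# library constraint a5 <= a1 does not guarantee, so the unit is rejected.
print(implies([("a5", "a1")], [("a5", "a1")]))   # True
print(implies([("a5", "a1")], [("a1", "a5")]))   # False
```

Enumeration is exponential in the number of level variables, so this is only a specification-level check; it does, however, make precise what "implied by the constraints of the superclass method" means.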

^1 It is possible that, because we are not doing full type inference, polymorphic recursion
in this setting is decidable, but we do not yet have results either way.

To cope with such a situation, there are a couple of approaches one may
adopt. Because changing library code makes the inference process non-modular,
one can change the code in the subclass, by relabeling field and parameter levels 
with ground constants in a sensible way, and re-run the tool to deliver the relaxed 
signatures. (This will be illustrated by an example in section 4.2). A more prac- 
tical approach is that during library design, the designer may want to consider 
anticipated uses of those library methods that are expected to be overridden in 
subclasses. The inferred signature of such library methods - an extreme example 
being abstract methods with no implementation - may be too general; hence, 
the designer may want to make some of the field types and method signatures 
more specific. Then, there would be more of a possibility that the constraints in 
the method signatures of library methods will imply those in the signatures of 
the overriding methods. Thus the security type signature we assume for library 
methods is allowed to be an arbitrary one. 

Contributions and Overview. This paper tackles all of the issues above. Our 
previous work [3] did not cope with libraries. Here, we do; thus we provide a 
new security type system for the language with polymorphic classes and methods 
that guarantees noninterference. We provide an inference algorithm that, in the 
context of a polymorphic library, infers security type signatures for methods 
of the current analysis unit. By restricting the current analysis unit to only 
contain monomorphic types, we can show that the algorithm computes principal 
monomorphic types, as justified by the completeness of the inference algorithm 
for such restricted units. Although there have been studies about both type 
inference and information flow analysis for imperative, functional and object- 
oriented programs, we have not found any work that addresses security type 
inference for object-oriented programs in the presence of libraries. We believe 
that the additional details required to account for modularity in the presence of 
method inheritance and override are a novel aspect of our work. 

After discussing two simple examples in section 2, we describe the language 
extended with parameterized classes in section 3, explain the inference algorithm 
in section 4 and give the soundness and completeness theorems in section 5. 
Related work and a discussion of the paper appear in section 6. 

2 Examples 

Consider the following classes: 

    class TAX extends Object {
      int income;
      int tax(int salary) { self.income := salary; result := self.income * 0.20; }
    }

    class Inquiry extends Object {
      TAX employee; int est;
      bool overpay() { int tmp := employee.tax(1000); result := (tmp > self.est); }
    }

The type of method tax in class TAX, written mtype(tax, TAX), is int → int.
Assuming that income has security level H, a possible security type for method
tax is $L, H -\!\langle H \rangle\!\to H$: when the level of the current object^2 is L and the level of
salary is at most H, then tax returns a result of level at most H and only H fields
(the H in the middle) may be updated^3 during method execution. This security
type can be verified using security type checking rules for method declaration
and commands.

In security type inference, we infer security types for the level of field income
and for method tax. We assume that TAX is a well-formed class declaration.
The inference algorithm is given class TAX as input, but with income's type
annotated, e.g., as (int, $\alpha_1$), where $\alpha_1$ is a placeholder for the actual level of the
field. It is also possible for income to be annotated with a constant level (e.g., L
or H). Apart from annotating fields, we also annotate the parameter and return
types of methods. In the sequel, we will let letters from the beginning of the
Greek alphabet range over level variables.

The type $(\alpha_2, \alpha_5 -\!\langle \alpha_3 \rangle\!\to \alpha_4) \mid \{\alpha_5 \le \alpha_1,\ \alpha_3 \le \alpha_1,\ \alpha_2 \le \alpha_1,\ \alpha_1 \le \alpha_4\}$ is
inferred for method tax; that is, if the level of the TAX object is at most $\alpha_2$ and
the level of salary is at most $\alpha_5$, then the level of the return result is at most $\alpha_4$
and fields of level at most $\alpha_3$ can be assigned to during execution of tax. The
first constraint precludes, e.g., assigning an H value to an L field.
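As a small illustration of what such a constrained signature admits, the Python sketch below enumerates the ground instantiations of the five level variables that satisfy the constraint set. It assumes the constraints read α5 ≤ α1, α3 ≤ α1, α2 ≤ α1 and α1 ≤ α4; the variable names `a1` to `a5` and the helper functions are ours.

```python
from itertools import product

ORDER = {"L": 0, "H": 1}   # L <= H

# Assumed constraint set of tax; a pair (x, y) means x <= y.
K_TAX = [("a5", "a1"), ("a3", "a1"), ("a2", "a1"), ("a1", "a4")]

def satisfies(env, constraints):
    """True if the ground assignment env meets every constraint."""
    return all(ORDER[env[x]] <= ORDER[env[y]] for x, y in constraints)

def ground_instances(variables, constraints):
    """All assignments of L/H to the variables that satisfy the constraints."""
    solutions = []
    for levels in product("LH", repeat=len(variables)):
        env = dict(zip(variables, levels))
        if satisfies(env, constraints):
            solutions.append(env)
    return solutions

names = ["a1", "a2", "a3", "a4", "a5"]
solutions = ground_instances(names, K_TAX)
print(len(solutions))   # 10

# The earlier hand-written type L, H -<H>-> H with income at level H.
manual = {"a1": "H", "a2": "L", "a3": "H", "a4": "H", "a5": "H"}
print(manual in solutions)   # True
```

An assignment that gives salary level H but income level L is rejected by the first constraint, which is exactly the leak the text mentions.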

The class TAX can now be converted into a library class, parameterized over
$\alpha_1$, and method tax given a polymorphic method signature.

    class TAX<α1> extends Object {
      (int, α1) income;
      (int, α4) tax((int, α5) salary) { self.income := salary;
                                        result := self.income * 0.20; }
    }

Library class TAX<$\alpha_1$> can be instantiated in multiple ways in the analysis
of another class, for instance, by instantiating $\alpha_1$ with a ground level, say H.
The intention is that for any ground instantiation of $\alpha_1, \ldots, \alpha_5$ that satisfies the
constraints, the body of TAX should be typable in the security type system.

The inference of class Inquiry takes place in the context of the library containing
TAX. Because TAX is parameterized, the type of field employee is assumed
to be TAX<$\beta_1$> and the level of employee is $\beta_2$. We could have chosen the level
of employee to be H, but by the typing rules, this would prevent access to a public
field, say name, of employee. Note that the level of est is completely specified.
The inferred type of method overpay is $(\alpha_6, () -\!\langle \alpha_7 \rangle\!\to \alpha_8) \mid K$. Some of the
constraints in $K$ generated by our inference algorithm are $\beta_1 \le \alpha_8$ and
$\beta_2 \le \alpha_8$.

Suppose Inquiry is annotated differently, so that the type of the employee
field is (TAX<H>, $\beta_2$), i.e., $\beta_1$ above has been instantiated to H. Then the
level of the result of the call to tax (i.e., level $\alpha_8$) will also be secret; this was
predicted by the constraint $\beta_1 \le \alpha_8$. Suppose $o$ is an object of type Inquiry
and suppose that its level is H. Then to prevent implicit information leakage

^2 i.e., the level of self. This information is used to prevent leaks of the pointer to the
current target object to other untrusted sources [3].

^3 This information, called the heap effect, is required to prevent leaks due to implicit
flow via conditionals and method call [5, 3].

due to the call to overpay, the return result should be H [3]. This can be seen
from the constraint $\alpha_6 \le \alpha_8$ with $\alpha_6$, the level of $o$, instantiated to H. Finally,
if employee itself is H, then the constraint $\beta_2 \le \alpha_8$ forces the level of the return
result to be H as expected.

Section 3 formalizes the annotated language and discusses security typing rules
for which noninterference can be shown. Next, section 4 considers a sublanguage
of the annotated language with programs annotated with level variables; for
this language we specify an algorithm that infers security types. We discuss
restrictions of the language to handle undecidability of inference and formalize
inference for inheritance and override of library methods in the current analysis
unit in a way that maintains modularity. These restrictions and treatments will
be illustrated by an example that overrides method TAX.tax in section 4.3.

3 Language and Security Typing 

We use the sequential class-based language from our previous work [3]. The 
difference is in the annotated language, where classes and methods may be poly- 
morphic in levels. This allows a library class to be used in more than one way. We 
make an explicit separation between a library class and a collection of additional 
classes that are based on the library. 

First, some terms: a unit is a collection of class declarations. A closed unit
is a collection of class declarations that is well formed as a complete program, 
that is, it is a class table. The library is a closed unit from which we need its 
polymorphic type signature, encoded in some auxiliary functions defined later. 
A program based on a library can consist of several classes which extend and use 
library classes and which may be mutually recursive. We use the term analysis 
unit for the classes to which the inference algorithm is applied. Due to mutual 
recursion, several classes may have to be considered together. An analysis unit 
must be well formed in the sense that the union of it with the library should 
form a closed unit. 

The Annotated Language. We shall now define the syntax for library units and 
also adapt the security typing rules from our previous work to the present lan- 
guage. Essentially, a polymorphic library is typable if all of its ground instances 
are. 

The grammar is in Table 1. Although identifiers with overlines indicate lists,
some of the formal definitions assume singletons to avoid unilluminating
complication.

Since the problem we want to address is secure information flow, all the
programs are assumed to be well formed as ordinary code, i.e., when all levels are
erased, including class parameters <$\overline{\lambda}$>. Typing rules for our Java-like language
are standard and can be found in our previous paper [3]. It suffices to recall that
a collection of class declarations, called a class table, is treated as a function
CT so that CT(C) is the code for class C. Moreover, field(f, C) gives the type
of field f in class C and mtype(m, C) gives the parameter and return types for
method m declared or inherited in C. Subtyping is invariant: D ≤ C implies
mtype(m, C) = mtype(m, D).

Table 1. Language grammar

$$
\begin{array}{lcl}
T & ::= & \mathsf{bool} \mid C \quad \text{(where } C \text{ ranges over } \mathit{ClassName}\text{)} \\
\kappa & ::= & H \mid L \quad \text{(level constants)} \\
\lambda & ::= & \alpha \mid \kappa \quad \text{(level variable, constant)} \\
U & ::= & T\langle\overline{\lambda}\rangle \quad \text{(we also use } W \text{ and } R \text{ for this category)} \\
CL & ::= & \mathsf{class}\ C\langle\overline{\alpha}\rangle\ \mathsf{extends}\ U\ \{\, (\overline{U}, \overline{\lambda})\ \overline{f};\ \overline{M} \,\} \\
M & ::= & U\ m(\overline{U}\ \overline{x})\{S\} \\
S & ::= & x := e \mid e.f := e \mid x := \mathsf{new}\ U \mid x := e.m(\overline{e}) \mid S; S \mid U\ x := e\ \mathsf{in}\ S \mid \mathsf{if}\ e\ \mathsf{then}\ S\ \mathsf{else}\ S \\
e & ::= & x \mid \mathsf{null} \mid \mathsf{true} \mid e.f \mid e == e \mid e\ \mathsf{is}\ U \mid (U)e
\end{array}
$$

We use $T$ to represent an ordinary data type, while $U$ ranges over parameterized
class types, which take the form $T\langle\bar{\lambda}\rangle$. Declaration of a parameterized
class binds some variables $\bar{\alpha}$ in the types of the superclass and fields:

$\mathbf{class}\ C\langle\bar{\alpha}\rangle\ \mathbf{extends}\ D\langle\bar{\lambda}\rangle\ \{(\bar{U}, \bar{\lambda}')\ \bar{f};\ \bar{M}\}$

All variables appearing in field declarations are bound at the class level. Thus the
parameterized class declaration above must satisfy a well-formedness condition:
$\bar{\alpha} \supseteq var(\bar{\lambda}) \cup var(\bar{U}) \cup var(\bar{\lambda}')$. These variables, if they appear in a method
declaration or body, are also bound at the class level. The remaining free variables
are bound at the method level, for method polymorphism.

Field types, including the security label, can be retrieved by a given function
$lsfield(f, U)$, which returns the appropriate type of field $f$ in a (possibly
instantiated) class $U$. Thus for the declaration displayed above, $lsfield(f, C\langle\bar{\alpha}\rangle) = (U, \lambda')$,
and for any $\bar{\lambda}''$ of the right length, $lsfield(f, C\langle\bar{\lambda}''\rangle) = (U, \lambda')[\bar{\alpha} \mapsto \bar{\lambda}'']$.
We require that $lsfield(f, U)$ is defined iff $field(f, T)$ is defined, and moreover*
if $U \preceq U'$ and $lsfield(f, U')$ is defined, then $lsfield(f, U) = lsfield(f, U')$.

Method types, possibly polymorphic, need more delicate treatment. Types
for methods are given using a signature function $lsmtype$ so that $lsmtype(m, U)$,
for method $m$ declared or inherited in class $U$, returns $m$'s signature and a set
$K$ of constraints in the form of inequalities between constants and variables.
We require that $lsmtype(m, T\langle\bar{\lambda}\rangle)$ is defined iff $mtype(m, T)$ is, and in that
case the signature takes the form $\lambda_0, (\bar{U}, \bar{\lambda}_1) \xrightarrow{(\lambda_2)} (R, \lambda_3) \mid K$. The signature
expresses the following policy: if the information in "self" is at most $\lambda_0$ and the
information in the parameters is at most $\bar{\lambda}_1$, with types $\bar{U}$, then any fields written
are at level $\lambda_2$ or higher and the result level is at least $\lambda_3$, with result type $R$,
provided that the constraints $K$ are satisfied.

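Following our previous work [3], we also require invariance under subclassing:
if $U \preceq R$ and $m$ is declared in $R$, then $lsmtype(m, U) = lsmtype(m, R)$
(regardless of whether $m$ is inherited or overridden in $U$). Analogous to the
subclassing requirement on methods in object-oriented languages, this is to ensure
information flow security in the context of dynamic method dispatch.

* The security subtyping relation $\preceq$ is defined in Table 2.

Table 2. Auxiliary definitions

For $T, T'$ such that $T \le T'$ we define:

$instance(T\langle\bar{\kappa}\rangle, T) = T\langle\bar{\kappa}\rangle$

$instance(T\langle\bar{\kappa}\rangle, T') = \mathbf{let}\ \mathbf{class}\ T\langle\bar{\alpha}\rangle\ \mathbf{extends}\ T''\langle\bar{\lambda}\rangle = CT(T)\ \mathbf{in}\ instance(T''\langle\bar{\lambda}[\bar{\alpha} \mapsto \bar{\kappa}]\rangle, T')$

$T\langle\bar{\kappa}\rangle \preceq T'\langle\bar{\kappa}'\rangle$ iff $instance(T\langle\bar{\kappa}\rangle, T') = T'\langle\bar{\kappa}'\rangle$

$tcomp(T\langle\bar{\lambda}\rangle, T'\langle\bar{\lambda}'\rangle) = \mathbf{let}\ T'\langle\bar{\lambda}''\rangle = instance(T\langle\bar{\lambda}\rangle, T')\ \mathbf{in}\ \bigcup_i \{\lambda''_i \le \lambda'_i;\ \lambda'_i \le \lambda''_i\}$
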

The subtyping relation $\preceq$ must take polymorphism and information flow
into account. For the built-in type, we define $\mathbf{bool} \preceq \mathbf{bool}$. Class subtyping can
be checked using the definition of $\preceq$ in Table 2, which propagates the
instantiation of a class up through the class hierarchy. The definition uses another
function, $instance$, that carries out this propagation and constructs a suitable
instantiation of a supertype. The auxiliary definition for downcast appears in the
appendix. For use in inference, we need to generate a set of constraints such that
two types with variables are in the $\preceq$ relation; this is the purpose of function
$tcomp$. We assume that if the analysis unit mentions a parameterized class, it
provides the right number of parameters.
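The two auxiliary functions can be sketched as follows. The class-table encoding, the one-parameter class `TAX` with formal `a`, and the representation of a constraint as a pair `(x, y)` meaning x ≤ y are assumptions made for illustration; only the recursion up the hierarchy follows Table 2.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical class table: name -> (formal level parameters,
# superclass name, level arguments passed to the superclass).
ClassTable = Dict[str, Tuple[List[str], str, List[str]]]

CT: ClassTable = {
    "TAX": (["a"], "Object", []),
    "CreditTAX": (["g0", "g1"], "TAX", ["g1"]),
}


def instance(name: str, args: List[str], target: str,
             ct: ClassTable = CT) -> List[str]:
    """Propagate the instantiation name<args> up the hierarchy to `target`."""
    if name == target:
        return list(args)
    if name not in ct:
        raise ValueError(f"{name} is not a subclass of {target}")
    formals, parent, parent_args = ct[name]
    subst = dict(zip(formals, args))
    # Apply the substitution [formals -> args] to the superclass arguments.
    return instance(parent, [subst.get(p, p) for p in parent_args], target, ct)


def tcomp(name: str, args: List[str], target: str, target_args: List[str],
          ct: ClassTable = CT) -> Set[Tuple[str, str]]:
    """Constraints making name<args> a security subtype of target<target_args>."""
    lifted = instance(name, args, target, ct)
    constraints = set()
    for mine, theirs in zip(lifted, target_args):
        constraints.add((mine, theirs))  # mine <= theirs
        constraints.add((theirs, mine))  # theirs <= mine
    return constraints


print(instance("CreditTAX", ["H", "L"], "TAX"))
```

Emitting both directions of each inequality reflects the definition of tcomp, which forces the lifted level arguments to coincide with those of the supertype.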

Security Typing Rules. Although the typing rules work with parameterized
class declarations and with polymorphic method signatures, the typing rules
for expressions and commands in method bodies only apply to ground judgements.
A security type context $\Delta$ is a mapping from variable names to security
types. We adopt the notation style for typing judgements from [3]. A judgement
$\Delta \vdash e : (U, \kappa)$ says that expression $e$ in context $\Delta$ has security type $(U, \kappa)$. A
judgement $\Delta \vdash S : \mathbf{com}\ \kappa_1, \kappa_2$ says that, in the context $\Delta$, command $S$ writes
no variables below $\kappa_1$ (variables live in the store and are gone after the execution
of the method) and no fields below $\kappa_2$ (fields stay in the heap until
garbage-collected).

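We give the rule for method call, $x := e.m(\bar{e})$, below. It uses the polymorphic
signature function of the method $m$, and requires that there be some satisfying
ground instance compatible with the levels at the call site; this is ensured
by requiring satisfiability of a constraint set $K'$ which contains the constraints
needed to match security types of parameters and arguments.

$$
\frac{
\begin{array}{c}
\Delta, x : (U, \kappa) \vdash e : (W, \kappa_3) \qquad \Delta, x : (U, \kappa) \vdash \bar{e} : (\bar{U}, \bar{\kappa}_4) \\
lsmtype(m, W) = \lambda_0, (\bar{U}', \bar{\lambda}) \xrightarrow{(\lambda_1)} (U', \lambda_2) \mid K \\
\kappa_5 \le \kappa \qquad \kappa_3 \le \kappa \qquad K' = K \cup K'' \cup tcomp(\bar{U}, \bar{U}') \cup tcomp(U', U) \\
K'' = \{\bar{\kappa}_4 \le \bar{\lambda},\ \lambda_2 \le \kappa,\ \kappa_6 \le \lambda_1,\ \kappa_3 \le \lambda_0,\ \kappa_3 \le \lambda_1\} \qquad K' \text{ is satisfiable}
\end{array}
}{
\Delta, x : (U, \kappa) \vdash x := e.m(\bar{e}) : \mathbf{com}\ \kappa_5, \kappa_6
}
$$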
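The side condition "$K'$ is satisfiable" can be illustrated with a small solver. This sketch assumes the two-point lattice L ≤ H and represents a constraint as a pair `(a, b)` meaning a ≤ b; the function name `solve` and the propagation strategy are choices made for the example, not the algorithm of the paper.

```python
from typing import Dict, Iterable, Optional, Tuple

CONSTANTS = ("L", "H")


def solve(constraints: Iterable[Tuple[str, str]]) -> Optional[Dict[str, str]]:
    """Return the least satisfying assignment, or None if none exists."""
    constraints = list(constraints)
    variables = {t for c in constraints for t in c if t not in CONSTANTS}
    level = {v: "L" for v in variables}  # start from the bottom of the lattice

    def value(term: str) -> str:
        return term if term in CONSTANTS else level[term]

    changed = True
    while changed:
        changed = False
        for a, b in constraints:
            if value(a) == "H" and value(b) == "L":
                if b == "L":  # H <= L can never hold
                    return None
                level[b] = "H"  # raise the variable on the right-hand side
                changed = True
    return level


print(solve({("H", "a"), ("a", "b")}))
```

Each pass raises at least one variable or stops, so the loop runs at most once per variable plus one final pass.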
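Table 3. Typing rules for method declarations and class declarations

$$
\frac{
\begin{array}{c}
lsmtype(m, C\langle\bar{\kappa}'\rangle) = \lambda_0, (\bar{U}, \bar{\lambda}) \xrightarrow{(\lambda_3)} (U, \lambda_4) \mid K \\
V = vars(\lambda_0, (\bar{U}, \bar{\lambda}) \xrightarrow{(\lambda_3)} (U, \lambda_4) \mid K) \\
\text{for all } I \text{ with } ok(K, V, I):\ \mathbf{let}\ \kappa_0, (\bar{U}', \bar{\kappa}) \xrightarrow{(\kappa_3)} (U', \kappa_4) = I(\lambda_0, (\bar{U}, \bar{\lambda}) \xrightarrow{(\lambda_3)} (U, \lambda_4))\ \mathbf{in} \\
\bar{x} : (\bar{U}', \bar{\kappa}),\ self : (C, \kappa_0),\ result : (U', \kappa_4) \vdash S : \mathbf{com}\ \kappa_1, \kappa_2 \qquad \kappa_3 \le \kappa_2
\end{array}
}{
C\langle\bar{\kappa}'\rangle\ \mathbf{extends}\ R \vdash U\ m(\bar{U}\ \bar{x})\{S\}
}
$$

$$
\frac{
\text{for all } M \in \bar{M}, \text{ all } \bar{\kappa}:\quad C\langle\bar{\kappa}\rangle\ \mathbf{extends}\ I(R) \vdash I(M) \quad \text{where } I = [\bar{\alpha} \mapsto \bar{\kappa}]
}{
\vdash \mathbf{class}\ C\langle\bar{\alpha}\rangle\ \mathbf{extends}\ R\ \{\bar{U}\ \bar{f};\ \bar{M}\}
}
$$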



Table 3 gives the rules for class and method declarations. A class declaration
is typable provided that all of its method declarations are. Typing a method
declaration requires checking its body with respect to all ground instantiations,
$I$, over the variables $V$ given by $lsmtype$. We define $ok(K, V, I)$ to mean that $I$
satisfies $K$.
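The quantification over ground instantiations can be made concrete by enumeration. The sketch below assumes the two-point lattice L ≤ H and encodes a constraint as a pair `(a, b)` meaning a ≤ b; `ground_instances` is a hypothetical helper name, and brute-force enumeration is used only for clarity.

```python
from itertools import product

LEVELS = ("L", "H")
ORDER = {("L", "L"), ("L", "H"), ("H", "H")}  # the lattice L <= H


def ground(term, inst):
    """Level constants denote themselves; variables are looked up in inst."""
    return term if term in LEVELS else inst[term]


def ok(constraints, variables, inst):
    """ok(K, V, I): I assigns a level to every variable in V and satisfies K."""
    if any(v not in inst for v in variables):
        return False
    return all((ground(a, inst), ground(b, inst)) in ORDER
               for a, b in constraints)


def ground_instances(constraints, variables):
    """Yield every ground instantiation of `variables` that satisfies K."""
    names = sorted(variables)
    for levels in product(LEVELS, repeat=len(names)):
        inst = dict(zip(names, levels))
        if ok(constraints, names, inst):
            yield inst


for inst in ground_instances({("a", "b")}, {"a", "b"}):
    print(inst)
```

A checker following Table 3 would type the method body once per instantiation yielded here; the number of instantiations grows exponentially in the number of variables, which is one motivation for working with constraints instead.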

Noninterference. Like FlowCaml [15], our system uses a level-polymorphic
language, both for more expressive libraries and because it is the natural result
of inference. The noninterference property asserted by a polymorphic type
is taken to be ordinary noninterference for all ground instances that satisfy
the constraints that are part of the type. For lack of space in this paper, we
omit the semantics and thus cannot formally define noninterference. Informally,
a command is noninterfering if, for any two initial states that are indistinguishable
for L (i.e., if all H fields and variables are removed), if both computations
terminate then the resulting states are indistinguishable. Indistinguishability is
defined in terms of a ground labeling of fields and variables. A method declaration
is noninterfering with respect to a given type if its body is noninterfering,
where the method type determines levels for parameters and result. A class table
is noninterfering if, for every ground instantiation of every class, every method
declaration is noninterfering.
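The informal definition can be illustrated by a brute-force test over a tiny state space. This is only a sketch under strong assumptions: states contain variables but no heap, commands are terminating Python functions on dictionaries, and values range over a small finite set; the names `low_view` and `is_noninterfering` are invented for the example.

```python
from itertools import product


def low_view(state, labels):
    """The part of a state visible at level L: drop every H variable."""
    return {x: v for x, v in state.items() if labels[x] == "L"}


def is_noninterfering(command, labels, values=(0, 1)):
    """Check all pairs of L-indistinguishable initial states."""
    names = sorted(labels)
    states = [dict(zip(names, vs)) for vs in product(values, repeat=len(names))]
    for s1, s2 in product(states, repeat=2):
        if low_view(s1, labels) != low_view(s2, labels):
            continue  # only L-indistinguishable initial states matter
        out1 = command(dict(s1))
        out2 = command(dict(s2))
        if low_view(out1, labels) != low_view(out2, labels):
            return False
    return True


def leak(s):
    s["l"] = s["h"]  # copies secret data into a public variable
    return s


def clear(s):
    s["l"] = 0  # public result does not depend on h
    return s


labels = {"l": "L", "h": "H"}
print(is_noninterfering(leak, labels), is_noninterfering(clear, labels))
```

The type system establishes the same property statically for all states, which this finite check cannot do.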

The forthcoming technical report [18] shows that if a class table is typable 
by the security rules then it is noninterfering. 

4 Inference 

In this section we give the complete inference process. The algorithm has two 
steps. In the first step (sections 4.1 and 4.2) it outputs the constraints that en- 
sure the typability of the classes being checked. In the second step (section 4.3), 
it takes the output from the first step, produces the parameterized signatures, 
and checks the subclassing invariance of these signatures. Then the new param- 
eterized signatures can be added to the library. 

4.1 Input 

One input to the inference algorithm is the pair of auxiliary functions giving
the polymorphic signatures of a library, namely, $lsfield$ for fields and $lsmtype$ for
methods.







The other input is the current analysis unit. Unlike library methods, all methods
implemented in the analysis unit are treated monomorphically with respect
to each other during the inference, even though they may override polymorphic
methods in the library. In particular, although we do not have explicit syntax
for mutual recursion*, mutually recursive classes are put in the same analysis
unit and are treated monomorphically. Method bodies can of course instantiate
library methods differently at different call sites.

For any set $V$ of variables, we write $I^{*}_{V}$ for some fixed renaming that maps
$V$ to distinct variables not in $V$.

The signature functions, $usfield$ and $usmtype$, provide the types of fields and
methods for classes in unit. We refrain from defining the simpler one, $usfield$.
For any set $V$ of level variables, define $usmtype(m, T, V)$ as follows:

1. If $T$ has a declaration of $m$

- If $T$ has a superclass $U$ in unit that declares $m$, then $usmtype(m, T, V) =
usmtype(m, U, V)$.

- Otherwise (i.e., any superclass of $T$ that declares $m$ is in the library),
$usmtype(m, T, V)$ has parameter types and return type as declared in
$T$; the heap effect and self level are two variables distinct from all level
variables in unit, and the signature has the empty constraint set.

2. If $T$ inherits $m$ from its superclass $U$

- If $U$ is in unit, $usmtype(m, T, V) = usmtype(m, U, V)$.

- If $U$ is a library class, $usmtype(m, T, V) = I^{*}_{V}(lsmtype(m, U))$.

By definition, $usmtype$ may return a type that is either monomorphic or
polymorphic, depending on whether there are any declarations of $m$ in unit at or
above $T$. If there are none, the method type is polymorphic and a renaming is
needed to ensure variable freshness. On the other hand, if there is a declaration
of $m$ in unit at or above $T$, $usmtype$ returns a fixed monomorphic type for all
call sites, even if $m$ has also been defined in the library.

4.2 Inference Rules 

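The inference algorithm is presented in the form of rules for a judgment that
generates constraints and keeps track of variables in use in order to ensure freshness
where needed. For expressions, the judgment has the form

$\Delta, V \vdash e : U\ \alpha, K, V'$

where $V, V'$ are sets of level variables, with $V \subseteq V'$. The judgment means that
in security type context $\Delta$, expression $e$ has type $U$ and level $\alpha$ provided the
constraints in $K$ are satisfied. Each rule also has a condition to ensure freshness
of new variables, e.g., $\alpha \notin V$. The constraints $K$ may be expressed using other
new variables; $V'$ collects all the new and existing variables. The correctness
property is that any ground instantiation $I$ of $V'$ that satisfies $K$ results in an
expression typable in the security type system, once we instantiate $\alpha$ and the
other variables. This is formalized in the soundness theorem.

* In contrast with explicit syntax for mutual recursion, say, in ML.

Table 4. Inference algorithm: selected command cases

$$
\frac{
\begin{array}{c}
\Delta, x : (U, \lambda_1), V \vdash e : U'\ \alpha_2, K_1, V_1 \\
K' = K_1 \cup \{\alpha_2 \le \lambda_1,\ \alpha_3 \le \lambda_1\} \cup tcomp(U', U) \qquad \alpha_3, \alpha_4 \notin V_1
\end{array}
}{
\Delta, x : (U, \lambda_1), V \vdash x := e : \mathbf{com}\ (\alpha_3, \alpha_4), K', \{\alpha_3, \alpha_4\} \cup V_1
}
$$

$$
\frac{
\begin{array}{c}
\big(\ (\lambda_0, (\bar{U}', \bar{\lambda}) \xrightarrow{(\lambda_1)} (R, \lambda_2) \mid K) = usmtype(m, W, V_1) \\
\lor\ (\lambda_0, (\bar{U}', \bar{\lambda}) \xrightarrow{(\lambda_1)} (R, \lambda_2) \mid K) = I^{*}_{V_1}(lsmtype(m, W))\ \big) \\
\Delta, x : (U, \lambda), V \vdash e : W\ \alpha_3, K_0, V_0 \qquad \Delta, x : (U, \lambda), V_0 \vdash \bar{e} : \bar{U}\ \bar{\alpha}_4, K_1, V_1 \\
V'' = V_1 \cup var(K) \cup var(R) \cup var(\bar{U}') \cup var(\lambda_0, \bar{\lambda}, \lambda_1, \lambda_2) \\
K' = K \cup K_0 \cup K_1 \cup tcomp(\bar{U}, \bar{U}') \cup tcomp(R, U) \cup K'' \\
K'' = \{\bar{\alpha}_4 \le \bar{\lambda},\ \alpha_3 \le \lambda_0,\ \alpha_3 \le \lambda_1,\ \lambda_2 \le \lambda,\ \alpha_3 \le \lambda,\ \alpha_6 \le \lambda_1,\ \alpha_5 \le \lambda\} \\
\alpha_5, \alpha_6 \notin V'' \qquad V' = \{\alpha_5, \alpha_6\} \cup V''
\end{array}
}{
\Delta, x : (U, \lambda), V \vdash x := e.m(\bar{e}) : \mathbf{com}\ (\alpha_5, \alpha_6), K', V'
}
$$
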

There is a similar judgment for commands: $\Delta, V \vdash S : \mathbf{com}\ (\alpha_1, \alpha_2), K', V'$,
where $V, V'$ are sets of level variables with $V \subseteq V'$, means that in security type
context $\Delta$, command $S$ writes to variables of level $\alpha_1$ or higher and to fields of
level $\alpha_2$ or higher.

We refrain from giving the full set of rules and discuss just a few cases,
which are given in Table 4. The first rule in the table is for variable assignment.
This may help the reader become familiar with the notation. The inferred type
for the assignment is $\mathbf{com}\ (\alpha_3, \alpha_4)$, where $\alpha_3, \alpha_4$ are fresh. The generated
constraint set $K'$ contains the set $K_1$ obtained during the inference of $e$; $U'$ is the
type of $e$ and $\alpha_2$ is the inferred level of $e$. As expected from the typing rule
for assignment, $K'$ contains the constraint set $\{\alpha_2 \le \lambda_1, \alpha_3 \le \lambda_1\}$ and also the
constraints between variables in $U'$ and $U$ generated by $tcomp(U', U)$, which
ensures the subtyping relation between $U'$ and $U$.
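The assignment case can be sketched as a constraint generator. The sketch simplifies the rule: it assumes the expression has already been analysed (its level, constraints and variable set are passed in) and that the types of both sides carry no level parameters, so the tcomp contribution is empty. The `Fresh` class and the function name are hypothetical.

```python
from itertools import count


class Fresh:
    """Supplier of level variables that avoid a given set of used names."""

    def __init__(self):
        self._ids = count()

    def new(self, used):
        while True:
            name = f"a{next(self._ids)}"
            if name not in used:
                return name


def infer_assign(ctx, x, expr_level, k1, v1, fresh):
    """Constraints for `x := e`, given e's level and its (K1, V1).

    ctx maps each variable to a pair (type, level); a constraint is a
    pair (a, b) meaning a <= b. tcomp(U', U) is assumed to be empty here.
    """
    _, lam1 = ctx[x]
    a3 = fresh.new(v1)
    a4 = fresh.new(set(v1) | {a3})
    constraints = set(k1) | {(expr_level, lam1), (a3, lam1)}
    return (a3, a4), constraints, set(v1) | {a3, a4}


effect, K, V = infer_assign({"x": ("bool", "lx")}, "x", "le",
                            set(), {"lx", "le"}, Fresh())
print(effect, sorted(K), sorted(V))
```

Threading the variable set through each call is what keeps the fresh variables of different commands apart.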

The most complicated rule is for method invocation (Table 4). It is developed
from the typing rule for method invocation (Table 3). One can see that the
conditions in the typing rule evolve to the constraints in the inference rule. There
are two cases depending on the static type of the target. If the target is defined
in the library, $lsmtype$ will return the polymorphic method type and a renaming
$I^{*}_{V_1}$ is used in the rule for freshness. Otherwise, $usmtype$ returns the appropriate
method signature, already renamed if necessary. In both cases, the type will be
matched against the calling context and constraints in the returned signature
will be integrated. The rule uses $tcomp$ to generate constraints that ensure type
compatibility.

The rules apply by structural recursion to a method body, generating con- 
straints for its primitive commands (like assignment and method call) and con- 
straints for combining these constituents (like in if/else). The rule for method 
declaration, first rule in Table 5 , matches a method body with its declared type 
and checks it, generating an additional constraint. 
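Table 5. Inference algorithm for method declaration and class declaration

$$
\frac{
\begin{array}{c}
\Delta, V \vdash S : \mathbf{com}\ (\alpha_1, \alpha_2), K_1, V_1 \qquad \Delta = [\bar{x} : (\bar{U}, \bar{\lambda});\ self : (C, \lambda_0);\ result : (R, \lambda_4)] \\
usmtype(m, C, V_1) = (\lambda_0, (\bar{U}, \bar{\lambda}) \xrightarrow{(\lambda_3)} (R, \lambda_4) \mid \emptyset)
\end{array}
}{
C\ \mathbf{extends}\ D, V \vdash R\ m(\bar{U}\ \bar{x})\{S\} : K_1 \cup \{\lambda_3 \le \alpha_2\}, V_1
}
$$

$$
\frac{
\forall M_i \in \bar{M}:\quad C\ \mathbf{extends}\ D, V_{i-1} \vdash M_i : K_i, V_i
}{
V_0 \vdash \mathbf{class}\ C\ \mathbf{extends}\ D\ \{\bar{U}\ \bar{f};\ \bar{M}\} : \bigcup_{1 \le i \le n} K_i, V_n
}
$$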



The rule for class declaration, also in Table 5, combines the constraints for all
its methods. We refrain from stating a formal rule for the complete analysis unit.
The conclusion, written $lsmtype, lsfield, usmtype, usfield \vdash unit : K, V$, depends
on two hypotheses. First, each class declaration in unit has been checked by
the rule in the table, yielding constraints $K$ over variables $V$. This check is
obtained by enumerating the class declarations in unit, threading variable sets
from one class to the next, and then taking for $K$ the union of the constraints.
The initial variable set contains all the fresh variables used in the definition of
$usmtype$ and all level variables that occur in unit. The second hypothesis is that
overriding declarations do not introduce new constraints, which would invalidate
the analysis of the library, which is assumed in the form of $lsmtype$. If this check
fails, the analysis fails. We will address the check at the end of section 4.3.

4.3 Building a New Library 

In this subsection we illustrate the manipulation of parameterized classes, result- 
ing in a new library signature. Then we give the definitions. Finally, we outline 
how subclassing invariance is checked. 

Producing New Signatures. Assume we define a class CreditTAX that extends
TAX. We have filled in level variables where needed.

class CreditTAX extends TAX<$\gamma_1$> {
  (int, $\gamma_0$) credit;
  (int, $\gamma_2$) tax((int, $\gamma_3$) salary) {
    self.income := salary; result := self.income*0.2 - self.credit; } }

Assume $usmtype(tax, CreditTAX, \{\gamma_0, \gamma_1, \gamma_2, \gamma_3\})$ returns
$(\gamma_s, (int, \gamma_3)) \xrightarrow{(\gamma_h)} (int, \gamma_2) \mid \emptyset$. We run the inference on the code, and get
the output $K, V$, where $V$ includes $\gamma_0, \gamma_1, \gamma_2, \gamma_3$ and other temporary level
variables generated during the inference.

To put CreditTAX into the library, we need to produce its signature. First
we define the list of formal parameters by collecting variables $\gamma_0$ from field
declarations and $\gamma_1$ from the "extends" clause. Second, we attach the generated
constraint $K$ to each method in the unit. The converted signature for CreditTAX,
in pseudocode, is:

class CreditTAX<$\gamma_0, \gamma_1$> extends TAX<$\gamma_1$> {
  credit: (int, $\gamma_0$);
  tax: $(\gamma_s, (int, \gamma_2)) \xrightarrow{(\gamma_h)} (int, \gamma_3) \mid K$; }

Now we formalize the process of producing new signatures. By the algorithm
we can get $(K, V)$ on the classes in the unit. For converting the code, let $X$ be
the set of class names declared in unit. Consider any class $C \in X$. Let $V'$ be
all the variables in $V$ that appear in the supertype or field type/label of $C$. Let
$unit'$ be unit but with every $C$ in $X$ replaced by $C\langle V'\rangle$. Each class in $unit'$ is now a
parameterized class with polymorphic methods.

Now we need to combine the signatures from the library and the unit. Based
on $unit'$, we will build a new signature function that can access the converted
unit and the library uniformly. Assume $unit'(T) = \mathbf{class}\ T\langle\bar{\alpha}\rangle\{\ldots\}$.

$fieldmerge(lsfield, usfield)(f, T\langle\bar{\lambda}\rangle) =$
  if $T$ in unit then $usfield(f, T)[\bar{\alpha} \mapsto \bar{\lambda}]$
  else $lsfield(f, T\langle\bar{\lambda}\rangle)$

$methmerge(lsmtype, usmtype, K)(m, T\langle\bar{\lambda}\rangle) =$
  if $m$ is inherited from class $D$ in the library then
    $lsmtype(m, instance(T\langle\bar{\lambda}\rangle, D))$
  else $(fst(usmtype(m, T, \emptyset)) \mid K)[\bar{\alpha} \mapsto \bar{\lambda}]$

In the second case of $methmerge$, $T.m$ is implemented in the unit. So the third
parameter for $usmtype$ is insignificant and the constraint in the return of
$usmtype(m, T, \emptyset)$ is empty. We use $fst$ to strip off this empty constraint.
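The field half of the merge can be sketched as a higher-order function. The encodings are assumptions made for illustration: `usfield` and `lsfield` are plain Python callables, `unit_formals` maps each converted unit class to its formal level parameters, and the stand-in library function labels every TAX field with the first class argument.

```python
def fieldmerge(lsfield, usfield, unit_formals):
    """Build one field-signature function covering unit and library.

    unit_formals maps each class of the converted unit to its formal
    level parameters; classes absent from it are library classes.
    """
    def sfield(f, cls, args):
        if cls in unit_formals:
            ftype, level = usfield(f, cls)
            subst = dict(zip(unit_formals[cls], args))
            # Instantiate the formal parameters with the actual arguments.
            return ftype, subst.get(level, level)
        return lsfield(f, cls, args)
    return sfield


# Hypothetical stand-ins for the two signature functions.
def usfield(f, cls):
    return {("credit", "CreditTAX"): ("int", "g0")}[(f, cls)]


def lsfield(f, cls, args):
    return ("int", args[0])


sfield = fieldmerge(lsfield, usfield, {"CreditTAX": ["g0", "g1"]})
print(sfield("credit", "CreditTAX", ["H", "L"]))
```

A corresponding sketch of methmerge would additionally attach the generated constraint set to each unit method before substituting.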

Checking Method Declarations for Proper Override. Rather than delving into
algorithmic optimizations, we just specify the check for overriding declarations
informally. We want to ensure that $U.m$ properly overrides $U'.m$ where $U'$ is a
superclass of $U$. We assume the constraint set has been simplified in that only
level constants and variables that are in the formal class parameter list or method
type signature are kept. For example, $\{\alpha \le \beta, \beta \le \gamma\}$ can be transformed into
$\{\alpha \le \gamma\}$ if $\beta$ is insignificant.
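One way to carry out this simplification is to close the constraint set under transitivity and then keep only the inequalities between significant terms. The sketch assumes constraints are pairs `(a, b)` meaning a ≤ b over the constants L and H; the function name and the naive closure loop are choices made for the example.

```python
def simplify(constraints, significant):
    """Project a set of inequalities onto the significant variables."""
    keep = set(significant) | {"L", "H"}  # level constants are always kept
    closure = set(constraints)
    changed = True
    while changed:  # naive transitive closure
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return {(a, b) for a, b in closure
            if a in keep and b in keep and a != b}


print(simplify({("alpha", "beta"), ("beta", "gamma")}, {"alpha", "gamma"}))
```

Taking the closure before dropping a variable preserves every consequence that variable mediated, which is what makes the projection safe for constraints of this simple form.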

The condition for proper override can be expressed as: every constraint
in the overriding method must be entailed [13] by the constraints in the
overridden method. For example, assume $lsmtype(m, U) = (L, () \xrightarrow{(\alpha_2)} \alpha_3) \mid K$ and
$lsmtype(m, U') = (\beta_1, () \xrightarrow{(\beta_2)} \beta_3) \mid K'$. We want to check if $U.m$ is properly
implemented. Since the level of self is $L$ for $U.m$, $\beta_1 = L$ should be entailed by $K'$.
Also, if $\alpha_2 \le \alpha_3 \in K$, $K'$ should entail it too.

We return to the CreditTAX example. It is not difficult to figure out that $K$
is the same as the constraint set (after the name conversion) in TAX.tax except
that there is one more inequality, $\gamma_0 \le \gamma_3$, in $K$. This is necessary to ensure
the typability of CreditTAX, but it makes the method tax more restrictive than
declared in TAX. When tax is invoked on a CreditTAX object as an instance
of TAX, the caller may assume $\gamma_0 = H, \gamma_3 = L$ as a valid precondition because
TAX.tax does not impose any constraint between $\gamma_0$ and $\gamma_3$. But this constraint
is obviously unsatisfiable for CreditTAX.tax in the context of dynamic dispatch,
and violates the underlying policy. To make CreditTAX pass the check $\gamma_0 \le \gamma_3$,
one can relabel field credit with $L$.
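The override check can be sketched as an entailment test by enumeration over the two-point lattice. The constraint set used for TAX.tax below is a hypothetical stand-in, since the actual set is not reproduced in this section; only the extra inequality between the level of credit and the result level is taken from the example. Function names are invented for the sketch.

```python
from itertools import product

LEVELS = ("L", "H")
RANK = {"L": 0, "H": 1}


def holds(constraint, inst):
    """Evaluate a <= b under a ground instantiation of the variables."""
    a, b = (inst.get(t, t) for t in constraint)
    return RANK[a] <= RANK[b]


def entails(premises, constraint, variables):
    """True if every instantiation satisfying premises satisfies constraint."""
    names = sorted(variables)
    for levels in product(LEVELS, repeat=len(names)):
        inst = dict(zip(names, levels))
        if all(holds(p, inst) for p in premises) and not holds(constraint, inst):
            return False
    return True


def override_violations(overridden, overriding, variables):
    """Constraints of the overriding method not entailed by the overridden one."""
    return sorted(c for c in overriding
                  if not entails(overridden, c, variables))


variables = {"g0", "g1", "g2", "g3"}
tax_k = {("g2", "g1"), ("g1", "g3")}  # hypothetical constraints of TAX.tax
credit_k = tax_k | {("g0", "g3")}     # the extra inequality from the example
print(override_violations(tax_k, credit_k, variables))
```

Relabeling credit with L turns the extra inequality into a constraint with L on the left, which every instantiation satisfies, so the list of violations becomes empty.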

We only compare constraints for a particular method; it is certainly not the
case that the constraints from the library imply all constraints for unit, e.g., the
unit can have additional methods.

Complexity. The time/space cost for the inference algorithm to generate
constraints is low-order polynomial in the size of the program, and independent
of the security lattice. We can show that the time to generate the constraint
set is O(mn(s + [...] where $m$ is the number of methods in the unit; $n$ is
the length of the unit; $s, t$ are the numbers of distinct variables at class level and
method level, respectively; $|P|$ is the size of the permission set. The size of the
generated constraint set is O(n(s + [...]

5 Soundness and Completeness of the Inference Algorithm

5.1 Soundness 

Theorem 1 (Soundness of inference algorithm).

Assume $sigs \vdash unit : K, V$*. Let $unit'$ be the converted unit and let $sfield =
fieldmerge(lsfield, usfield)$ and $smtype = methmerge(lsmtype, usmtype, K)$ be
the converted signatures. Then $sfield, smtype \vdash unit'$.

5.2 Completeness of Inference Algorithm 

In our system, the most general signatures of mutually recursive classes cannot
be represented in finite forms. Thus the inference algorithm cannot be complete,
since our algorithm will always terminate and produce finite output. We have to
restrict the classes in the current analysis unit in order to prove completeness.

Define a unit to be monomorphically typed if all type references and method 
invocations for the same class or method in a class body are instantiated exactly 
in the same way. 

Theorem 2 (Completeness).

If $I(unit)$ is monomorphically typed in $I(sigs)$, the constraints produced by
the algorithm for unit are satisfiable by an extension of $I$.

In other words, the algorithm yields principal types for a monomorphically
typed unit with respect to the polymorphic library. This is analogous
to type inference of recursive functions in ML. For example, in the ML term
$\mathbf{letrec}\ f(x) = t_1\ \mathbf{in}\ t_2$, all occurrences of $f$ in $t_1$ are monomorphic. The current
unit is comparable to $t_1$, and $t_2$ is comparable to classes in other units that
can use the current unit polymorphically once it has been made part of a library.
The theorem relies on lemmas for expressions, commands, method and class
declarations. We only list the lemmas for expressions and commands.



* We use $sigs$ to abbreviate $lsfield, lsmtype, usfield, usmtype$.

Lemma 1. Assume $I(e)$ is monomorphically typed in $I(sigs)$. If $I(sigs), I(\Delta) \vdash
I(e) : (U_c, \kappa)$ and $sigs, \Delta, V \vdash e : U\ \alpha, K, V'$, where $U_c$ is a type parameterized
over level constants, then $\exists I' \supseteq I\ .\ ok(K, V', I') \land \kappa = I'(\alpha) \land U_c = I'(U)$.

Lemma 2. Assume $I(S)$ is monomorphically typed in $I(sigs)$. If $I(sigs), I(\Delta) \vdash
I(S) : \mathbf{com}\ \kappa_1, \kappa_2$ and $sigs, \Delta, V \vdash S : \mathbf{com}\ (\alpha_1, \alpha_2), K', V'$, then
$\exists I' \supseteq I\ .\ ok(K', V', I') \land \kappa_1 = I'(\alpha_1) \land \kappa_2 = I'(\alpha_2)$.

6 Related Work and Discussion 

Related Work. Volpano and Smith [19] give a security type system and a
constraint-based inference algorithm for a simple procedural language. The type 
system guarantees noninterference: a well-typed program does not leak sensi- 
tive data. The inference algorithm is sound and complete with respect to the 
type system. However, they do not handle object-oriented features, and their 
suggestion to handle library polymorphism by duplicating code is impractical. 

Myers [9, 10] gives a security type system for full Java, but leaves open the
problem of justifying the rules with a noninterference result. Myers, Zdancewic
and their students have implemented a secure compiler, Jif*, that implements
the security typing rules. Jif handles several advanced features like constrained
method signatures, exceptions, declassification, dynamic labels and polymorphism.
Jif's inheritance allows overriding methods to be more general than overridden
methods, which means that the constraints in the overridden method
must be stronger than those in the overriding method. However, inference in the
system is only intraprocedural. Field and method types are added either manually or
by default.

Simonet presents a version of ML with security flow labels, termed FlowCaml
[16, 15], which supports polymorphism, exceptions, structural subtyping and
the module system. The type system is polymorphic and has been shown to
ensure noninterference. Simonet and Pottier [12] give an algorithm to infer security
types. They also prove soundness of type inference.

There is a rich literature on type inference for object-oriented programs [20,
11, 2, 4, 21]. However, we are interested in security type inference, rather than
full type inference; we assume that a well-typed program is given. We found it 
difficult to adapt the techniques in these works because they do not consider 
modular inference in the presence of libraries. 

We have a working prototype for a whole program analysis for the language 
in [3]. It accepts a class declaration that is partly annotated with level constants, 
generating a constraint set and checking its satisfiability. If the code is typable, 
the output will be a polymorphic type for the given program in its most general 
form. The extension of the prototype for the present paper is currently under 
way. 



* On the web at http://www.cs.cornell.edu/jif/

Deployment Model. For an application developer, the signatures in the library
specify security requirements. The developer must annotate additional methods 
in the current analysis unit with new policies. Running a check on the annotated 
program can then tell whether it is secure with respect to the library policies. 

For library designers, the tool is helpful in that it not only enforces the speci- 
fied security policies, but also gives designers a chance to revise the result signa- 
tures if the signatures appear too general and seem likely to prevent subclasses 
from being implemented because subclasses cannot introduce new flows. 

To make the result signatures more general for a collection of classes, it 
is advisable to make the analysis unit as small as possible. Classes that make 
mutually recursive references need to be analyzed together. This is the only 
reason to make units have more than one class. 

Conclusion. The main contribution of this paper is the specification of a modular 
algorithm that infers security types for a sequential, class-based, object-oriented 
language. This requires the addition of security level variables to the language 
and moreover, requires classes parameterized with security levels. The inference 
algorithm constructs a library where each class is parameterized by the levels 
in its fields. Each method of a parameterized class can be given a polymorphic, 
constrained signature. This has the additional benefit of being more expressive 
and flexible for the programmer. We have given soundness and completeness 
theorems for the algorithm and work is in progress on a prototype. We have 
not yet experimented with the scalability of our technique to real sized pro- 
grams. Such an experiment and its results will be reported in the first author’s 
dissertation. Our work would also benefit from a comparison with the HM(X) 
constraint-based type inference framework [17]. Our suspicion, however, is that 
to prove soundness and completeness, there might be substantial overhead in 
the translation of our security types to the HM(X) framework. 

References 

1. Martin Abadi. Secrecy by typing in security protocols. Journal of the ACM, 
46(5):749-786, September 1999. 

2. Ole Agesen. The cartesian product algorithm: Simple and precise type inference of 
parametric polymorphism. In European Conference on Object Oriented Program- 
ming (ECOOP), pages 2-26, 1995. 

3. Anindya Banerjee and David A. Naumann. Secure information flow and pointer 
confinement in a Java-like language. In IEEE Computer Security Foundations 
Workshop (CSFW), pages 253-270. IEEE Computer Society Press, 2002. 

4. Gilad Bracha, Martin Odersky, David Stoutamire, and Philip Wadler. Making 
the future safe for the past: Adding genericity to the Java programming language. 
In Craig Chambers, editor, ACM Symposium on Object Oriented Programming: 
Systems, Languages, and Applications (OOPSLA), pages 183-200, Vancouver, BC, 
1998. 

5. Dorothy Denning and Peter Denning. Certification of programs for secure infor- 
mation flow. Communications of the ACM, 20(7):504-513, 1977. 







6. J. Goguen and J. Meseguer. Security policies and security models. In Proceedings
of the 1982 IEEE Symposium on Security and Privacy, pages 11-20, 1982.

7. Fritz Henglein. Type inference with polymorphic recursion. ACM Transactions on 
Programming Languages and Systems, 15(2):253-289, April 1993. 

8. Alan Mycroft. Polymorphic type schemes and recursive definitions. In Sixth
International Symposium on Programming, number 166 in Lecture Notes in Computer
Science. Springer-Verlag, 1984.

9. Andrew C. Myers. JFlow: Practical mostly-static information flow control. In 
ACM Symposium on Principles of Programming Languages (POPL), pages 228- 
241, 1999. 

10. Andrew C. Myers. Mostly-Static Decentralized Information Flow Control. PhD
thesis, Laboratory of Computer Science, MIT, 1999.

11. Jens Palsberg and Michael I. Schwartzbach. Object-oriented type inference. In 
ACM Symposium on Object Oriented Programming: Systems, Languages, and Ap- 
plications (OOPSLA). ACM Press, 1991. 

12. François Pottier and Vincent Simonet. Information flow inference for ML. In
ACM Symposium on Principles of Programming Languages (POPL), pages 319- 
330, 2002. 

13. Jakob Rehof and Fritz Henglein. The complexity of subtype entailment for simple
types. In Proceedings LICS '97, Twelfth Annual IEEE Symposium on Logic in
Computer Science, Warsaw, Poland, June 1997.

14. Andrei Sabelfeld and Andrew C. Myers. Language-based information-flow security. 
IEEE J. Selected Areas in Communications, 21(1):5-19, January 2003. 

15. Vincent Simonet. Flow Caml in a nutshell. In Graham Hutton, editor, Proceedings 
of the first APPSEM-II workshop, pages 152-165, March 2003. 

16. Vincent Simonet. The Flow Caml System: documentation and user’s manual. 
Technical Report 0282, Institut National de Recherche en Informatique et en Au- 
tomatique (INRIA), July 2003. 

17. Christian Skalka and François Pottier. Syntactic type soundness for HM(X). In
Proceedings of the Workshop on Types in Programming (TIP ’02), volume 75 of 
Electronic Notes in Theoretical Computer Science, July 2002. 

18. Qi Sun, Anindya Banerjee, and David A. Naumann. Constraint-based security 
flow inferencer for a Java-like language. Technical Report KSU CIS TR-2004-2, 
Kansas State University, 2004. In preparation. 

19. Dennis Volpano and Geoffrey Smith. A type-based approach to program
security. In Proceedings of TAPSOFT'97, number 1214 in Lecture Notes in Computer
Science, pages 607-621. Springer-Verlag, 1997.

20. Mitchell Wand. Complete type inference for simple objects. In Proc. 2nd IEEE
Symposium on Logic in Computer Science, pages 37-44, 1987.

21. Tiejun Wang and Scott Smith. Precise constraint-based type inference for Java.
In European Conference on Object Oriented Programming (ECOOP), 2001. 




Information Flow Analysis in Logical Form 



Torben Amtoft and Anindya Banerjee* 

Department of Computing and Information Sciences 
Kansas State University, Manhattan KS 66506, USA 
{tamtoft,ab}@cis.ksu.edu



Abstract. We specify an information flow analysis for a simple impera- 
tive language, using a Hoare-like logic. The logic facilitates static check- 
ing of a larger class of programs than can be checked by extant type-based 
approaches in which a program is deemed insecure when it contains an 
insecure subprogram. The logic is based on an abstract interpretation 
of program traces that makes independence between program variables 
explicit. Unlike other, more precise, approaches based on a Hoare-like 
logic, our approach does not require a theorem prover to generate invari- 
ants. We demonstrate the modularity of our approach by showing that 
a frame rule holds in our logic. Moreover, given an insecure but termi- 
nating program, we show how strongest postconditions can be employed 
to statically generate failure explanations. 



1 Introduction 

This paper specifies an information flow analysis using a Hoare-like logic and 
considers an application of the logic to explaining insecure flow of information 
in simple imperative programs. 

Given a system with high, or secret (H), and low, or public (L) inputs and 
outputs, where L < H is a security lattice, a classic security problem is how to 
enforce the following end-to-end confidentiality policy: protect secret data, i.e., 
prevent leaks of secrets at public output channels. An information flow analysis 
checks if a program satisfies the policy. Denning and Denning were the first to 
formulate an information flow analysis for confidentiality [11]. Subsequent
advances have been comprehensively summarized in the recent survey by Sabelfeld
and Myers [27]. An oft-used approach for specifying static analyses for infor- 
mation flow is security type systems [23,29]. Security types are ordinary types 
of program variables and expressions annotated with security levels. Security 
typing rules prevent leaks of secret information to public channels. For example, 
the security typing rule for assignment prevents H data from being assigned to 
an L variable. A well-typed program “protects secrets”, i.e., no information flows
from H to L during program execution. 

In the security literature, “protects secrets” is formalized as noninterference
[13] and is described in terms of an “indistinguishability” relation on states.
Two program states are indistinguishable for L if they agree on values of L
variables. The noninterference property says that any two runs of a program starting
from two initial states indistinguishable for L, yield two final states that are in- 
distinguishable for L. The two initial states may differ on values of H variables 
but not on values of L variables; the two final states must agree on the current 
values of L variables. One reading of the noninterference property is as a form of 
(in)dependence [7]: L output is independent of H inputs. It is this notion that 
is made explicit in the information flow analysis specified in this paper. 

A shortcoming of usual type-based approaches for information flow [4, 14, 29, 
24] is that a type system can be too imprecise. Consider the sequential program 
l := h; l := 0, where l has type L and h has type H. This program is rejected
by a security type system on account of the first assignment. But the program
obviously satisfies noninterference: final states of any two runs of the program
will always have the same value, 0, for l and are thus indistinguishable for L.

How can we admit such programs? Our inspiration comes from abstract 
interpretation [8], which can be viewed as a method for statically computing 
approximations of program invariants [9]. A benefit of this view is that the static
abstraction of a program invariant can be used to annotate a program with pre- 
and postconditions and the annotated program can be checked against a Hoare- 
like logic. In information flow analysis, the invariant of interest is independence of 
variables, for which we use the notation [x # w] to denote that x is independent
of w. The idea is that this holds provided any two runs (hereafter called traces
and formalized in Section 2) which have the same initial¹ value for all variables
except for w will at least agree on the current value of x. This is just a convenient 
restatement of noninterference but we tie it to the static notion of variable 
independence. 

The set of program traces is potentially infinite, but our approach statically 
computes a finite abstraction, namely a set of independences, T#, that describes
a set of traces, T. This is formalized in Section 3. We formulate (in Section 4) a 
Hoare-like logic for checking independences and show (Section 5) that a checked 
program satisfies noninterference. The assertion language of the logic is decidable 
since it is just the language of finite sets of independences with subset inclusion. 
Specifications in the logic have the form {T0#} C {T#}. Given precondition T0#,
we show in Section 6 how to compute strongest postconditions; for programs with
loops, this necessitates a fixpoint computation². We show that the logic deems
the program l := h; l := 0 secure: the strongest postcondition of the program
contains the independence [l # h].

Our approach falls in between type-based analysis and full verification where 
verification conditions for loops depend on loop invariants generated by a the- 
orem prover. Instead, we approximate invariants using a fixpoint computation. 
Our approach is modular and we show that our logic satisfies a frame rule (Sec- 
tion 7). The frame rule permits local reasoning about a program: the relevant 



¹ The initial value of a variable is its value before execution of the whole program.

² The set of independences is a finite lattice, hence the fixpoint computation will
terminate.

independences for a program are only those [x # w] where x occurs in the program.
Moreover, in a larger context, the frame rule allows the following inference
(in analogy with [21]): start with a specification {T0#} C {T#} describing
independences before and after store modifications; then, {T0# ∪ T1#} C {T# ∪ T1#}
holds provided C does not modify any variable y, where [y # w] appears in T1#.
The initial specification, {T0#} C {T#}, can reason with only the slice of store
that C touches.

We also show (Section 9) that strongest postconditions can be used to stat- 
ically generate failure explanations for an insecure but terminating program. If 
there is a program fragment C whose precondition contains [l # h] but whose
strongest postcondition does not contain [l # h], we know statically that C is an
offending fragment. Thus we may expect to find two initial values of h which
produce two different values of l. We consider two ways this may happen [11]; we
do not consider termination, timing leaks and other covert channels. One reason
for failure of [l # h] to be in the strongest postcondition is that C assigns H data
to an L variable. The other reason is that C is a conditional or a while loop whose
guard depends on a high variable and which updates a low variable in its body.
Consider, for example, if h then l := 1 else l := 0. Our failure explanation for
the conditional will be modulo an interpretation function that, for distinct
variables h1 and h2, maps h1 to true and h2 to false. Under this interpretation, the
execution of the program produces two different values of l. This explains why
l is not independent of h. Because we use a static analysis, false positives may
be generated: consider if h then l := 7 else l := 7, a program that is deemed
insecure when it is clearly not. However, such false positives can be ruled out by 
an instrumented semantics that tracks constant values more precisely. 

Contributions. First and foremost, we formulate information flow analysis in a 
logical form via a Hoare-like logic. The approach deems more programs secure 
than extant type-based approaches. Secondly, we describe the relationship be- 
tween information flow and program dependence, explored in [1, 16], in a more 
direct manner by computing independences between program variables. The in- 
dependences themselves are static descriptions of the noninterference property. 
In Section 8, we show how our logic conservatively extends the security type 
system of Smith and Volpano [29], by showing that any well-typed program in
their system satisfies the invariant [l # h]. Thirdly, when a program is deemed
insecure, the annotated derivation facilitates explanations on why the program is 
insecure by statically generating counterexamples. The development in this pa- 
per considers termination-insensitive noninterference only: we assume that an 
attacker cannot observe nontermination. Complete proofs of all theorems appear 
in the companion technical report [2].

2 Language: Syntax, Traces, Semantics 

This section gives the syntax of a simple imperative language, formalizes the 
notion of traces, and gives the language a semantics using sets of traces.

Syntax. We consider a simple imperative language with assignment, sequencing,
conditionals and loops as formalized by the following BNF. Commands C ∈
Cmd are given by the syntax

C ::= x := E | C1 ; C2 | if E then C1 else C2 | while E do C

where Var is an infinite set of variables, x, y, z, w ∈ Var range over variables
and where E ∈ Exp ranges over expressions. Expressions are left unspecified
but we shall assume the existence of a function fv(E) that computes the free
variables of expression E. For commands, fv(C) is defined in the obvious way.
We also define a function modified : Cmd → P(Var) that, given a command,
returns the set of variables potentially assigned to by the command.
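
The grammar and the two helper functions can be mirrored in a few lines of code. The following Python sketch is illustrative only: the class names are hypothetical, an expression is represented solely by its set of free variables (since the paper leaves expressions unspecified), and fv(C) is assumed to include the assigned variable.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Assign:                 # x := E
    var: str
    expr_fv: FrozenSet[str]   # fv(E); E itself stays unspecified


@dataclass(frozen=True)
class Seq:                    # C1 ; C2
    first: "Cmd"
    second: "Cmd"


@dataclass(frozen=True)
class If:                     # if E then C1 else C2
    guard_fv: FrozenSet[str]
    then: "Cmd"
    orelse: "Cmd"


@dataclass(frozen=True)
class While:                  # while E do C
    guard_fv: FrozenSet[str]
    body: "Cmd"


Cmd = Union[Assign, Seq, If, While]


def fv(c: Cmd) -> FrozenSet[str]:
    """Free variables of a command (assumed to include assigned variables)."""
    if isinstance(c, Assign):
        return c.expr_fv | {c.var}
    if isinstance(c, Seq):
        return fv(c.first) | fv(c.second)
    if isinstance(c, If):
        return c.guard_fv | fv(c.then) | fv(c.orelse)
    return c.guard_fv | fv(c.body)


def modified(c: Cmd) -> FrozenSet[str]:
    """Variables potentially assigned to by the command."""
    if isinstance(c, Assign):
        return frozenset({c.var})
    if isinstance(c, Seq):
        return modified(c.first) | modified(c.second)
    if isinstance(c, If):
        return modified(c.then) | modified(c.orelse)
    return modified(c.body)


# The running example l := h ; l := 0
prog = Seq(Assign("l", frozenset({"h"})), Assign("l", frozenset()))
```

For the running example, modified(prog) contains only l, while fv(prog) contains both h and l.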

Traces. A trace t ∈ Trc associates each variable with its initial value and its
current value; here values v ∈ Val are yet unspecified but we assume that there
exists a predicate true? on Val. (For instance, we could have Val as the set of
integers and let true?(v) be defined as v ≠ 0.) We shall use T ∈ P(Trc) to range
over sets of traces. Basic operations on traces include:

- ini-t(x), which returns the initial value of x as recorded by t;

- cur-t(x), which returns the current value of x as recorded by t;

- t[y ↦ v], which returns a trace t' with the property: for all x ∈ Var,
  ini-t'(x) = ini-t(x), and if x ≠ y then cur-t'(x) = cur-t(x); but cur-t'(y) = v.

- The predicate initial T on sets of traces T holds iff for all traces t ∈ T, and
  for all variables x, we have ini-t(x) = cur-t(x).

For instance, we could represent a trace t as a mapping Var → Val × Val;
with t(x) = (v_i, v_c) we would then have ini-t(x) = v_i and cur-t(x) = v_c.

We shall write t1 =_x t2 to denote that cur-t1(x) = cur-t2(x), and we shall
write ¬(t1 =_x t2) to denote that t1 =_x t2 does not hold. Also, we shall write
t1 =_{¬x} t2 to denote that for y ≠ x, ini-t1(y) = ini-t2(y) holds. That is, the
initial values of all variables, except for x, are equal in t1 and t2.



Semantics. We assume that there exists a semantic function ⟦E⟧ : Trc → Val
which satisfies the following property: if for all x ∈ fv(E) we have t1 =_x t2,
then ⟦E⟧(t1) = ⟦E⟧(t2). The definition of ⟦E⟧ would contain the clause ⟦x⟧(t) =
cur-t(x). For each T and E we define

E-true(T) = {t ∈ T | true?(⟦E⟧(t))}

E-false(T) = T \ E-true(T).

The semantics of a command has functionality ⟦C⟧ : P(Trc) → P(Trc), and is
defined in Fig. 1. To see that the last clause in Fig. 1 is well-defined, notice that
F_C is a monotone function on the complete lattice P(Trc) → P(Trc).

⟦x := E⟧ = λT.{t' | ∃t ∈ T • t' = t[x ↦ ⟦E⟧(t)]}

⟦C1 ; C2⟧ = λT.⟦C2⟧(⟦C1⟧(T))

⟦if E then C1 else C2⟧ = λT.⟦C1⟧(E-true(T)) ∪ ⟦C2⟧(E-false(T))

⟦while E do C0⟧ = lfp(F_C) where C = while E do C0 and

    F_C : (P(Trc) → P(Trc)) → (P(Trc) → P(Trc))
    F_C(f) = λT.f(⟦C0⟧(E-true(T))) ∪ E-false(T)


Fig. 1. The Trace Semantics. 
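
As an informal illustration of Fig. 1, the clauses can be executed on finite sets of traces. The Python sketch below makes several assumptions that are not part of the paper: commands are encoded as tuples, expressions as Python functions over the current store, true?(v) is taken to be v ≠ 0, and the least fixed point for loops is approximated by unfolding at most `fuel` times, so traces that are still looping when the fuel runs out are simply dropped.

```python
def initial(store):
    """Build an initial trace: every variable has ini == cur."""
    return frozenset((x, (v, v)) for x, v in store.items())


def cur(t):
    """Current-value store of trace t."""
    return {x: c for x, (_, c) in t}


def update(t, y, v):
    """t[y -> v]: change the current value of y, keep all initial values."""
    return frozenset((x, (i, v if x == y else c)) for x, (i, c) in t)


def run(cmd, T, fuel=1000):
    """Collecting semantics [[cmd]](T) on a finite frozenset of traces."""
    kind = cmd[0]
    if kind == "assign":                     # ("assign", x, E)
        _, x, e = cmd
        return frozenset(update(t, x, e(cur(t))) for t in T)
    if kind == "seq":                        # ("seq", C1, C2)
        return run(cmd[2], run(cmd[1], T, fuel), fuel)
    if kind == "if":                         # ("if", E, C1, C2)
        _, e, c1, c2 = cmd
        yes = frozenset(t for t in T if e(cur(t)) != 0)
        return run(c1, yes, fuel) | run(c2, T - yes, fuel)
    if kind == "while":                      # ("while", E, C0)
        _, e, c0 = cmd
        done = frozenset()
        while T and fuel > 0:
            yes = frozenset(t for t in T if e(cur(t)) != 0)
            done |= T - yes                  # traces that leave the loop
            T = run(c0, yes, fuel)           # traces that iterate once more
            fuel -= 1
        return done
    raise ValueError(kind)


T0 = frozenset(initial({"h": h, "l": 5}) for h in (0, 1, 2))

# l := h ; l := 0
prog = ("seq", ("assign", "l", lambda s: s["h"]), ("assign", "l", lambda s: 0))
out = run(prog, T0)

# while h != 0 do h := 7 (diverges unless h is initially 0)
loop = ("while", lambda s: s["h"] != 0, ("assign", "h", lambda s: 7))
survivors = run(loop, T0)
```

In this sketch every trace in `out` has current value 0 for l, and `survivors` contains only the trace that started with h = 0, reflecting that the semantics records terminating traces only.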



3 Independences 

We are interested in a finite abstraction of a (possibly infinite) set of concrete
traces. The abstract values are termed independences: an independence T# ∈
Independ = P(Var × Var) is a set of pairs of the form [x # w], denoting that
the current value of x is independent of the initial value of w. This is formalized
by the following definition of when an independence correctly describes a set of
traces. The intuition is that x is independent of w iff any two traces which have
the same initial values except on w must agree on the current value of x; in other
words, the initial value of w does not influence the current value of x at all.

Definition 1. [x # w] ⊨ T holds iff for all t1, t2 ∈ T: t1 =_{¬w} t2 implies t1 =_x t2.
T# ⊨ T holds iff for all [x # w] ∈ T# it holds that [x # w] ⊨ T.
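
Definition 1 can be checked by brute force on a finite set of traces. The following Python sketch is only an illustration: the function names are hypothetical, and a trace is assumed to be a dictionary mapping each variable to a pair (initial value, current value).

```python
from itertools import combinations


def same_ini_except(t1, t2, w):
    """Initial values agree on every variable other than w."""
    return all(t1[y][0] == t2[y][0] for y in t1 if y != w)


def same_cur(t1, t2, x):
    """Current values of x agree."""
    return t1[x][1] == t2[x][1]


def describes(x, w, traces):
    """[x # w] |= T for a finite list of traces T."""
    return all(same_cur(t1, t2, x)
               for t1, t2 in combinations(traces, 2)
               if same_ini_except(t1, t2, w))


# Traces reached by l := h from initial states with h in {0, 1} and l = 5.
after_copy = [{"h": (0, 0), "l": (5, 0)}, {"h": (1, 1), "l": (5, 1)}]

# Traces reached by l := h ; l := 0 from the same initial states.
after_reset = [{"h": (0, 0), "l": (5, 0)}, {"h": (1, 1), "l": (5, 0)}]

# An initial set of traces: ini == cur for every variable.
init = [{"h": (h, h), "l": (l, l)} for h in (0, 1) for l in (0, 1)]
```

Here describes("l", "h", after_copy) is false while describes("l", "h", after_reset) is true. On the initial set, [l # h] holds but [l # l] does not, which is why statements about initial sets are restricted to pairs of distinct variables.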

Definition 2. The ordering T1# ≼ T2# holds iff T2# ⊆ T1#.

This is motivated by the desire for a subtyping rule, stating that if T1# ≼ T2# then
T1# can be replaced by T2#. Such a rule is sound provided T2# is a subset of T1#
and therefore obtainable from T1# by removing information. Clearly, Independ
forms a complete lattice wrt. the ordering; let ⊓_i T_i# denote the greatest lower
bound (which is the set union). We have some expected properties:

- if T# ⊨ T and T1 ⊆ T then T# ⊨ T1;

- if T1# ⊨ T and T1# ≼ T2# then T2# ⊨ T;

- if for all i ∈ I it holds that T_i# ⊨ T, then ⊓_{i ∈ I} T_i# ⊨ T.

Moreover, we can write a concretization function γ : Independ → P(P(Trc)):
γ(T#) = {T | T# ⊨ T}. It is easy to verify that γ is completely multiplicative.
Therefore [20, p.237] there exists a Galois connection between P(P(Trc)) and
Independ, with γ the concretization function. Finally, we have the following
fact about initial sets of traces.



Fact 1. For all T, if initial T then [x # y] ⊨ T for all x ≠ y.

4 Static Checking of Independences

To statically check independences we define, in Fig. 2, a Hoare-like logic where
judgements are of the form G ⊢ {T0#} C {T#}. The judgement is interpreted
as saying that if the independences in T0# hold before execution of C then,
provided C terminates, the independences in T# will hold after execution of
C. The context G ∈ Context = P(Var) is a control dependence, denoting (a
superset of) the variables that at least one test surrounding C depends on. For
example, in if x then y := 0 else z := 1, the static checking of y := 0 takes
place in the context that contains all variables that x is dependent on. This is
crucial, especially since x may depend on a high variable.

We now explain a few of the rules in Fig. 2. Checking an assignment, x := E,
in context G, involves checking any [y # w] in the postcondition T#. There are
two cases. If x ≠ y, then [y # w] must also appear in the precondition T0#.
Otherwise, if x = y then [x # w] appears in the postcondition provided all variables
referenced in E are independent of w; moreover, w must not appear in G, as
otherwise, x would be (control) dependent on w.

Checking a conditional, if E then C1 else C2, involves checking C1 and C2
in a context G0 that includes not only the “old” context G but also the variables
that E depends on (as variables modified in C1 or C2 will be control dependent
on such). Equivalently, if w is not in G0, then all free variables x in E must be
independent of w, that is, [x # w] must appear in the precondition T0#.

Checking a while loop is similar to checking a conditional. The only difference
is that it requires guessing an “invariant” T# that is both the precondition and
the postcondition of the loop and its body.

In Section 6, when we define strongest postcondition, we will select G0 = G ∪
{w | ∃x ∈ fv(E) • [x # w] ∉ T0#} for the conditional and the while loop. Instead
of guessing the invariant, we will show how to compute it using fixpoints.

Example 1. We have the derivations 

∅ ⊢ {{[l # h], [h # l]}} l := h {{[h # l], [l # l]}} and

∅ ⊢ {{[h # l], [l # l]}} l := 0 {{[h # l], [l # l], [l # h]}}

and therefore also

∅ ⊢ {{[l # h], [h # l]}} l := h ; l := 0 {{[h # l], [l # l], [l # h]}}

With the intuition that l stands for “low” or “public” and h stands for
“high” or “sensitive”, the derivation asserts that if l is independent of h before
execution, then provided the program halts, l is independent of h after execution.
By Definition 1, any two traces of the program with different initial values for h
agree on the current value of l. Thus the program is secure, although it contains
an insecure sub-program. 

Example 2. The reader may check that the following informally annotated pro- 
gram gives rise to a derivation in our logic. Initially, G is empty, and all variables 
are pairwise independent; we write [x # y, z] to abbreviate [x # y], [x # z].

[Assign]  G ⊢ {T0#} x := E {T#}
          if ∀[y # w] ∈ T# •
             x ≠ y ⇒ [y # w] ∈ T0#
             x = y ⇒ w ∉ G ∧ ∀z ∈ fv(E) • [z # w] ∈ T0#

          G ⊢ {T0#} C1 {T1#}    G ⊢ {T1#} C2 {T2#}
[Seq]     ------------------------------------------
          G ⊢ {T0#} C1 ; C2 {T2#}

          G0 ⊢ {T0#} C1 {T#}    G0 ⊢ {T0#} C2 {T#}
[If]      ------------------------------------------
          G ⊢ {T0#} if E then C1 else C2 {T#}
          if G ⊆ G0 and w ∉ G0 ⇒ ∀x ∈ fv(E) • [x # w] ∈ T0#

          G0 ⊢ {T#} C {T#}
[While]   ------------------------------
          G ⊢ {T#} while E do C {T#}
          if G ⊆ G0 and w ∉ G0 ⇒ ∀x ∈ fv(E) • [x # w] ∈ T#

          G1 ⊢ {T1#} C {T2#}
[Sub]     --------------------
          G ⊢ {T0#} C {T3#}
          if G ⊆ G1, T0# ≼ T1# and T2# ≼ T3#


Fig. 2. The Hoare Logic. 



                 {[l # h, x], [h # l, x], [x # l, h]}
x := h           {[l # h, x], [h # l, x], [x # l, x]}

if x > 0         (G is now {h})

  then l := 7    {[l # x, l], [h # l, x], [x # l, x]}

  else x := 0    {[l # h, x], [h # l, x], [x # l, x]}

end of if        {[l # x], [x # l, x]}

A few remarks: 

- in the preamble, only x is assigned, so the independences for l and h are
  carried through, but [x # l, x] holds afterwards, as [h # l, x] holds beforehand;

- the free variable in the guard is independent of l and x but not of h, implying
  that h has to be in G.



5 Correctness 

We are now in a position to prove the correctness of the Hoare logic with respect 
to the trace semantics. 

Theorem 2. Assume that 

G ⊢ {T0#} C {T#} where for all [x # y] ∈ T0#, it is the case that x ≠ y.

Then, initial T implies T# ⊨ ⟦C⟧(T).

That is, if T is an initial set, then T# correctly describes the set of concrete
traces obtained by executing command C on T.

The correctness theorem can be seen as the noninterference theorem for information
flow. Indeed, with l and h interpreted as “low” and “high” respectively,
suppose [l # h] appears in T#. Then any two traces in ⟦C⟧(T) (the set of traces
resulting from the execution of command C from initial set T) that have initial
values that differ only on h, must agree on the current value of l.

Note that the correctness result deals with “terminating” traces only. For 
example, with P = while h ≠ 0 do h := 7 and T# = {[l # h], [h # l]} we have
the judgement ∅ ⊢ {T#} P {T#} (since {h} ⊢ {T#} h := 7 {T#}) showing that
P is deemed secure by our logic, yet an observer able to observe non-termination 
can detect whether h was initially 0 or not. 

To prove Theorem 2, we claim the following, more general, lemma. Then the 
theorem follows by the lemma using Fact 1. 

Lemma 1. If G ⊢ {T0#} C {T#} and T0# ⊨ T then also T# ⊨ ⟦C⟧(T).

6 Computing Independences 

In Fig. 3 we define a function 

sp : Context × Cmd × Independ → Independ

with the intuition (formalized below) that given a control dependence G, a command
C and a precondition T0#, sp(G, C, T0#) computes a postcondition T# such
that G ⊢ {T0#} C {T#} holds, and T# is the “largest” set (wrt. the subset ordering)
that makes the judgement hold. Thus we compute the “strongest provable
postcondition”, which might differ³ from the strongest semantic postcondition,
that is, the largest set T# such that for all T, if T0# ⊨ T then T# ⊨ ⟦C⟧(T).

In the companion technical report [2], we show how to also compute “weakest
precondition”; we conjecture that the developments in Sections 7 and 9 could
also be carried out using weakest precondition instead of strongest postcondition.

We now explain two of the cases in Fig. 3. In an assignment, x := E, the
postcondition carries over all independences [y # w] in the precondition if y ≠ x;
these independences are unaffected by the assignment to x. Suppose that w does
not occur in context G. Then x is not control dependent on w. Moreover, if
all variables referenced in E are independent of w, then [x # w] will be in the
postcondition of the assignment.

The case for while is best explained by means of an example. 

Example 3. Consider the program 



C = while y do l := x ; x := y ; y := h.

Let T0#, ..., T8# be given by the following table. For example, the entry in the
column for T4# and in the row for x shows that [x # h] ∈ T4# and [x # l] ∈ T4#.

³ For example, let C = l := h - h and T0# = {[l # h]}. Then [l # h] is in the strongest
semantic postcondition, since for all T and all t ∈ ⟦C⟧(T) we have cur-t(l) = 0 and
therefore [l # h] ⊨ ⟦C⟧(T), but not in the strongest provable postcondition.

sp(G, x := E, T#) =
    {[y # w] | y ≠ x ∧ [y # w] ∈ T#} ∪ {[x # w] | w ∉ G ∧ ∀y ∈ fv(E) • [y # w] ∈ T#}

sp(G, C1 ; C2, T#) = sp(G, C2, sp(G, C1, T#))

sp(G, if E then C1 else C2, T#) =
    let G0 = G ∪ {w | ∃x ∈ fv(E) • [x # w] ∉ T#}
        T1# = sp(G0, C1, T#)
        T2# = sp(G0, C2, T#)
    in T1# ∩ T2#

sp(G, while E do C0, T#) =
    let H_C^{T#,G} : Independ → Independ be given by (C = while E do C0)
          H_C^{T#,G}(T0#) =
            let G0 = G ∪ {w | ∃x ∈ fv(E) • [x # w] ∉ T0#}
            in sp(G0, C0, T0#) ∩ T#
    in lfp(H_C^{T#,G})



Fig. 3. Strongest Postcondition.

Our goal is to compute sp(∅, C, T0#) and doing so involves the fixed point
computation sketched below.





                 Iteration
                 first     second               third

while y do       T0#       T4# = T3# ∩ T0#      T7# = T6# ∩ T0#
  G0:            {y}       {h,y}                {h,y}
  l := x         T1#       T5#                  T8#
  x := y         T2#       T6#                  T6#
  y := h         T3#       T6#                  T6#



For example, the entry in the column marked “second” and in the second
row from the bottom denotes that sp({h, y}, x := y, T5#) = T6#.

Note that after the first iteration, [l # h] is still present; it takes a second
iteration to filter it out and thus detect insecurity. The third iteration affirms
that T7# is indeed a fixed point (of the functional H_C^{T0#,∅} defined in Fig. 3).
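
The definition in Fig. 3 and the iteration of Example 3 can be replayed mechanically. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation: commands are encoded as tuples that carry only the free variables of their expressions, an independence [x # w] is the pair (x, w), the infinite set Var is replaced by a finite universe V, and the loop fixed point is reached by iterating from the precondition, as in Example 3. The precondition used for Example 3, all pairs of distinct variables, is likewise an assumption.

```python
def sp(G, cmd, T, V):
    """Strongest provable postcondition over the finite variable universe V."""
    kind = cmd[0]
    if kind == "assign":                      # ("assign", x, fv(E))
        _, x, fv_e = cmd
        kept = {(y, w) for (y, w) in T if y != x}
        new = {(x, w) for w in V
               if w not in G and all((y, w) in T for y in fv_e)}
        return frozenset(kept | new)
    if kind == "seq":                         # ("seq", C1, C2)
        return sp(G, cmd[2], sp(G, cmd[1], T, V), V)
    if kind == "if":                          # ("if", fv(E), C1, C2)
        _, fv_e, c1, c2 = cmd
        G0 = G | {w for w in V if any((x, w) not in T for x in fv_e)}
        return sp(G0, c1, T, V) & sp(G0, c2, T, V)
    if kind == "while":                       # ("while", fv(E), C0)
        _, fv_e, body = cmd
        cur = frozenset(T)
        while True:                           # iterate until a fixed point
            G0 = G | {w for w in V if any((x, w) not in cur for x in fv_e)}
            nxt = sp(G0, body, cur, V) & frozenset(T)
            if nxt == cur:
                return cur
            cur = nxt
    raise ValueError(kind)


# Example 1: l := h ; l := 0 with precondition {[l # h], [h # l]}
V1 = {"h", "l"}
pre1 = frozenset({("l", "h"), ("h", "l")})
ex1 = sp(frozenset(),
         ("seq", ("assign", "l", {"h"}), ("assign", "l", set())),
         pre1, V1)

# Example 3: while y do l := x ; x := y ; y := h
V3 = {"h", "l", "x", "y"}
pre3 = frozenset((a, b) for a in V3 for b in V3 if a != b)
body = ("seq", ("assign", "l", {"x"}),
        ("seq", ("assign", "x", {"y"}), ("assign", "y", {"h"})))
ex3 = sp(frozenset(), ("while", {"y"}, body), pre3, V3)
```

Under these assumptions, ex1 contains ("l", "h"), matching Example 1, whereas ex3 does not: the pair is still present after the first pass over the loop body and is removed by the second.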

Theorem 3 states the correctness of the function sp, that it indeed computes 
a postcondition. Then, Theorem 4 states that the postcondition computed by sp 
is the strongest postcondition. We shall rely on the following property:

Lemma 2 (Monotonicity). For all C, the following holds (for all G, G1,
T#, T1#):

1. if G ⊆ G1 then sp(G, C, T#) ≼ sp(G1, C, T#);

2. if T# ≼ T1# then sp(G, C, T#) ≼ sp(G, C, T1#).

Theorem 3. For all C, G, T#, it holds that G ⊢ {T#} C {sp(G, C, T#)}.

Theorem 4. For all judgements G ⊢ {T0#} C {T#}, sp(G, C, T0#) ≼ T#.

The following result is useful for the developments in Sections 7 and 9:

Lemma 3. Given y, C with y ∉ modified(C). Then for all T#, G, w:

[y # w] ∈ T# implies [y # w] ∈ sp(G, C, T#).

7 Modularity and the Frame Rule 

Define lhs(T#) = {y | [y # w] ∈ T#}. Then we have

Theorem 5 (Frame rule (I)). Let T0# and C be given. Then for all T#, G:

1. If lhs(T0#) ∩ modified(C) = ∅ then sp(G, C, T# ∪ T0#) ⊇ sp(G, C, T#) ∪ T0#.

2. If lhs(T0#) ∩ fv(C) = ∅ then sp(G, C, T# ∪ T0#) = sp(G, C, T#) ∪ T0#.

Note that the weaker premise in 1 does not imply the stronger consequence in
2, since (with [z # w] playing the role of T0#)

sp(∅, x := y + z, {[y # w]} ∪ {[z # w]}) = {[y # w], [z # w], [x # w]}
sp(∅, x := y + z, {[y # w]}) ∪ {[z # w]} = {[y # w], [z # w]}.
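
The two sets above can be recomputed from the assignment case of sp. The following Python sketch is illustrative only: it restricts sp to assignments, encodes an independence [x # w] as the pair (x, w), and assumes the finite universe {w, x, y, z} in place of Var.

```python
def sp_assign(G, x, fv_e, T, V):
    """Assignment case of sp: keep [y # w] for y != x, add derivable [x # w]."""
    kept = {(y, w) for (y, w) in T if y != x}
    new = {(x, w) for w in V
           if w not in G and all((y, w) in T for y in fv_e)}
    return kept | new


V = {"w", "x", "y", "z"}
T = {("y", "w")}          # {[y # w]}
frame = {("z", "w")}      # {[z # w]}, the framed independences

# sp(∅, x := y + z, T ∪ frame)
with_frame = sp_assign(set(), "x", {"y", "z"}, T | frame, V)

# sp(∅, x := y + z, T) ∪ frame
framed_after = sp_assign(set(), "x", {"y", "z"}, T, V) | frame
```

Since z is not modified by the assignment, the inclusion of part 1 holds, but z occurs free in the command, so the premise of part 2 fails and the inclusion is strict: only the first set contains ("x", "w").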

In separation logic [17,25], the frame rule is motivated by the desire for local
reasoning: if C1 and C2 modify disjoint regions of a heap, reasoning about C1
can be performed independently of the reasoning about C2. In our setting, a
consequence of the frame rule is that when analyzing a command C occurring in
a larger context, the relevant independences are the ones whose left hand sides
occur in C.

Theorem 5 is proved by observing that part (1) follows from Lemmas 3 and 
2; then part (2) follows using the following result: 

Lemma 4. Let T0# and C be given, with lhs(T0#) ∩ fv(C) = ∅. Then for all T#
and G, sp(G, C, T# ∪ T0#) ⊆ sp(G, C, T#) ∪ T0#.

As a consequence of Theorem 5 we get the following result:

Corollary 1 (Frame rule (II)). Assume that G ⊢ {T1#} C {T2#} and that
lhs(T0#) ∩ modified(C) = ∅. Then G ⊢ {T1# ∪ T0#} C {T2# ∪ T0#}.

Proof. Using Theorems 5 and 4 we get sp(G, C, T1# ∪ T0#) ≼ sp(G, C, T1#) ∪ T0# ≼
T2# ∪ T0#. Since by Theorem 3 we have G ⊢ {T1# ∪ T0#} C {sp(G, C, T1# ∪ T0#)},
the result follows by [Sub].

A traditional view of modularity in the security literature is the “hook-up
property” [19]: if two programs are secure then their composition is secure as well.
Our logic satisfies the hook-up property for sequential composition; in our
context, a secure program is one which has [l # h] as an invariant (if [l # h] is in the
precondition, it is also in the strongest postcondition). With this interpretation,
Sabelfeld and Sands’s hook-up theorem holds [28, Theorem 5].

8 The Smith-Volpano Security Type System

In the Smith-Volpano type system [29], variables are labelled by security types;
for example, x : (T, κ) means that x has type T and security level κ. To handle
implicit flows due to conditionals, the technical development requires commands
to be typed (com κ) with the intention that all variables assigned to in such
commands have level at least κ. The judgement Γ ⊢ C : (com κ) says that in
the security type context Γ, that binds free variables in C to security types,
command C has type (com κ).

We now show a conservative extension: if a command is well-typed in the
Smith-Volpano system, then for any two traces, the current values of low
variables are independent of the initial values of high variables. For simplicity, we
consider a command with only two variables, h with level H and l with level L.

Theorem 6. Assume that C can be given a security type wrt. environment
h : (_, H), l : (_, L). Then for all T#, if [l # h] ∈ T# then [l # h] ∈ sp(∅, C, T#).

The upshot of the theorem is that a well-typed program has [l # h] as invariant:
if [l # h] appears in the precondition, then it also appears in the strongest
postcondition.

9 Counter-Example Generation 

Assume that a program C cannot be deemed secure by our logic, that is,
[l # h] ∉ sp(∅, C, T0#) (where T0# ⊇ {[l # h]}). Then we might expect that we
can find a “witness”: two different initial values of h that produce two different
final values of l. However, below we shall see three examples of false positives:
programs which, while deemed insecure by our logic, do not immediately satisfy
that property. Ideally, we would like to strengthen our analysis so as to rule
out such false positives; this does not seem immediately feasible and instead, in
order to arrive at a suitable result, we shall modify our semantics so the false
positives become genuine positives. The programs in question are:

l := h - h                        (1)

if h then l := 7 else l := 7      (2)

while h do l := 7                 (3)
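
The notion of a witness can be made concrete with a brute-force search over a few initial values of h. The Python sketch below is purely illustrative: each program is modelled by hand as a function from the initial values of h and l to the final value of l, and the helper name `find_witness` is hypothetical. Program (3) is left out because, under the usual semantics, it does not terminate when h is true.

```python
from itertools import combinations


def find_witness(program, h_values, l0=0):
    """Return two initial values of h that yield different final l, or None."""
    for h1, h2 in combinations(h_values, 2):
        if program(h1, l0) != program(h2, l0):
            return h1, h2
    return None


def leaky(h, l):
    """if h then l := 1 else l := 0"""
    return 1 if h else 0


def prog1(h, l):
    """(1)  l := h - h"""
    return h - h


def prog2(h, l):
    """(2)  if h then l := 7 else l := 7"""
    return 7 if h else 7
```

With h ranging over 0, 1, 2, the search finds the witness (0, 1) for the leaky conditional, and no witness for programs (1) and (2), although the logic rejects all three.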

To deal with (1), a program where writing a high expression to a low variable
does not reveal anything about the high variable, we shall assume that expressions
are unevaluated (kept as symbolic trees); formally we demand that if there
exists x ∈ fv(E) with ¬(t1 =_x t2), then ⟦E⟧(t1) ≠ ⟦E⟧(t2).

To deal with (2), a program where writing to a low variable under high
guard does not immediately enable an observer to determine the value of the 
high variable, we tag each assignment statement so that an observer can detect 
which branch is taken. 

Finally, we must deal with (3), a program where there cannot be two different 
final values of l. There seems to be no simple way to fix this, except to rule out
loops, thus in effect considering only programs with a fixed bound on run-time 
(since for such, a loop can be unfolded repeatedly and eventually replaced by a 
sequence of conditionals; this is how we handle loops with low guard) . Remember 
(cf. Section 5) that a program deemed secure by our logic may not be really secure 
if non-termination can be observed; similarly a program deemed insecure may 
not be really insecure if non-termination cannot be observed. 

Even with the above modifications, the existence of a witness is not amenable 
to a compositional proof. For consider the program x := E1(h) ; l := E2(x) where
E1 and E2 are some expressions. Inductively, on the assignment to l, we can
find two different values for x, v1 and v2, such that the resulting values of l
are different. But we then need an extremely strong property concerning the
assignment to x: that there exist two different values of h such that evaluating
E1(h) wrt. these values produces v1 and v2, respectively.

Instead, we shall settle for a result which says that all pairs of different initial 
values for h are witnesses, in that the resulting values of l are different. Of course,
we need to introduce some extra assumptions to establish this stronger property.
For example, consider the program if h = 0 then l := 17 else l := 7 where two
different values of h, say 3 and 4, may cause the same branch to be taken. To
deal with that, our result must say that for every two values of h there exists
an interpretation of true? such that wrt. that interpretation, different values of l
result. In the above, we might stipulate that true?(3 = 0) but not true?(4 = 0).
It turns out to be convenient to let that interpretation depend on the guard in 
question; hence we shall also tag guards so as to distinguish between different 
occurrences of the same guard. 

We thus end up with a semantics ⟦C⟧_X parametrized wrt. an interpretation
X; the full development is in [2] where the following result is proved:

Theorem 7. Assume that sp(∅, C, T0#) = T#, with [x # h] ∈ T0# for x ≠ h and
with [l # h] ∉ T#. Further assume that ¬(t1 =_h t2), with the tags of t1 and t2
being disjoint from the tags in C.

Then there exists an interpretation X such that ¬(⟦C⟧_X(t1) =_l ⟦C⟧_X(t2)).

10 Discussion 

Perspective. This paper specifies an information flow analysis for confidentiality 
using a Hoare-like logic and considers an application of the logic to explaining 
insecurity in simple imperative programs. Program traces, potentially infinitely 
many, are abstracted by finite sets of variable independences. These variable 
independences can be statically computed using strongest postconditions, and 
can be statically checked against the logic.

Giacobazzi and Mastroeni [12] consider attackers as abstract interpretations
and generalize the notion of noninterference by parameterizing it wrt. what an 
attacker can analyze about the input/output information flow. For instance, 
assume an attacker can only analyze the parity (odd/even) of values. Then 

while h do l := l + 2 ; h := h - 1

is secure, although it contains an update of a low variable under a high guard. 
We might try to model this approach in our framework by parameterizing Defi- 
nition 1 wrt. parity, but it is not clear how to alter the proof rules accordingly. 
Instead, we envision our logic to be put on top of abstract interpretations. In 
the above example, the program would be abstracted to while h do h := h — 1 
which our logic already deems secure. 
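
The parity claim is easy to confirm experimentally. The Python sketch below is only an illustration: it assumes integer values, takes a guard to be true when it is non-zero, and restricts h to non-negative values so that the loop terminates.

```python
def run(h, l):
    """while h do l := l + 2 ; h := h - 1 (terminates for h >= 0)."""
    while h != 0:
        l, h = l + 2, h - 1
    return l


# Final value of l for a small grid of initial (l, h) pairs.
finals = {(l0, h0): run(h0, l0) for l0 in range(4) for h0 in range(5)}

# The parity of the final l never depends on h, only on the initial l.
parity_preserved = all(final % 2 == l0 % 2 for (l0, _), final in finals.items())
```

The final value of l does depend on h (for instance, initial l = 0 gives 0 for h = 0 and 2 for h = 1), yet `parity_preserved` is true: an attacker who observes only parity learns nothing about h.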

Related Work. Perhaps the most closely related work is the one of Clark, Hankin,
and Hunt [6], who consider a language similar to ours and then extend it to
Idealized Algol, requiring distinguishing between identifiers and locations. The
analysis for Idealized Algol is split in two stages: the first stage does a
control-flow analysis, specified using a flow logic [20]. The second stage specifies what is
an acceptable information flow analysis with respect to the control-flow analysis. 
The precision of the control-flow analysis influences the precision of the infor- 
mation flow analysis. Flow logics usually do not come with a frame rule so it is 
unclear what modularity properties their analysis satisfies. For each statement 
S in the program, they compute the set of dependences introduced by S; a pair
(x, y) is in that set if different values for y prior to execution of S may result in 
different values for x after execution of S. For a complete program, they thus, 
as expected, compute essentially the same information as we do, but the infor- 
mation computed locally is different from ours: we estimate if different initial 
values of y, i.e., values of y prior to execution of the whole program, may result 
in different values for x after execution of S. Unlike our approach, their analysis 
is termination-sensitive. 

To make our logic termination-sensitive, we could (analogous in spirit to [6])
define [⊥ # w] to mean that if two tuples of initial values are equal except for on
w, then either both tuples give rise to terminating computations, or both tuples
give rise to infinite computations. For instance, if

⊢ {T0#} while x > 7 do x := x + 1 {T#}

and [x # h] does not belong to T0# then [⊥ # h] should not belong to T# (neither
of any subsequent assertion), since different values of h may result in different 
values of x and hence of different termination properties. To prove semantic 
correctness for the revised logic we would need to also revise our semantics, 
since currently it does not facilitate reasoning about infinite computations. 

Joshi and Leino [18] provide an elegant semantic characterization of non-
interference that allows handling both termination-sensitive and termination-
insensitive noninterference. Their notion of security for a command C is equa-
tionally characterized by C ; HH = HH ; C ; HH, where HH means that an
arbitrary value is assigned to a high variable. They show how to express their
notion of security in Dijkstra’s weakest precondition calculus. Although they do 
not consider synthesizing loop invariants, this can certainly be done via a fixpoint 
computation with weakest preconditions. However, their work is not concerned 
with computing dependences, nor do they consider generating counterexamples. 
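
As a small illustration of the equational characterization, the following Python sketch checks C ; HH = HH ; C ; HH by brute force over a finite state space. It is only a sketch under stated assumptions: states are pairs (l, h) over a three-element domain, commands are modeled as relations on states, and the names hh, rel, compose, and secure are hypothetical helpers that do not come from [18].

```python
from itertools import product

DOMAIN = range(3)
STATES = list(product(DOMAIN, DOMAIN))      # a state is a pair (l, h)

def hh():
    """HH: keep l and assign an arbitrary value to the high variable h."""
    return {((l, h), (l, h2)) for (l, h) in STATES for h2 in DOMAIN}

def rel(cmd):
    """The relation of a deterministic command given as a function on states."""
    return {((l, h), cmd(l, h)) for (l, h) in STATES}

def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def secure(cmd):
    """Check C ; HH == HH ; C ; HH on the finite state space."""
    c, h = rel(cmd), hh()
    return compose(c, h) == compose(compose(h, c), h)

# l := l + 1 (mod 3) never reads h, whereas l := h copies it into l.
increment_low = lambda l, h: ((l + 1) % 3, h)
copy_high = lambda l, h: (h, h)
```

Under these assumptions, secure(increment_low) holds while secure(copy_high) fails, because the final value of l reveals the initial h.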

Darvas, Hähnle and Sands [10] use dynamic logic to express secure informa-
tion flow in JavaCard. They discuss several ways that noninterference can be
expressed in a program logic, one of which is as follows: consider a program with
variables l and h. Consider another copy of the program with l, h relabeled to
fresh variables l', h' respectively. Then, noninterference holds in the following
situation: running the original program and the copy sequentially such that the
initial state satisfies l = l' should yield a final state satisfying l = l'. Like us,
they are interested in showing insecurity by exhibiting distinct initial values for 
high variables that give distinct current values of low variables; unlike us, they 
look at actual runtime values. To achieve this accuracy, they need the power of 
a general purpose theorem prover, which is also helpful in that they can express 
declassification, as well as treat exceptions (which most approaches based on 
static analysis cannot easily be extended to deal with). 
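
The self-composition reading of noninterference can be tried out by exhaustive search on a small domain. The sketch below is not the dynamic-logic encoding of [10]; the two toy programs and the helper noninterferent are assumptions introduced purely for illustration.

```python
from itertools import product

def secure_prog(l, h):
    # The final low value ignores h entirely.
    return l + 1

def leaky_prog(l, h):
    # The final low value depends on h through the branch.
    return l + 1 if h > 0 else l

def noninterferent(prog, domain):
    """Run prog and a relabeled copy (l2, h2); whenever the initial low values
    agree, the final low values must agree too. Returns (ok, counterexample)."""
    for l, h, l2, h2 in product(domain, repeat=4):
        if l == l2 and prog(l, h) != prog(l2, h2):
            return False, (l, h, h2)    # same l, two distinct high values
    return True, None
```

A failing check returns two distinct initial high values for the same low value, which is the kind of counterexample discussed above.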

Barthe, D’Argenio and Rezk [5] use the same idea of self-composition (i.e., 
composing a program with a copy of itself) as Darvas et alii and investigate 
“abstract” noninterference [12] for several languages. By parameterizing non- 
interference with a property, they are able to handle more general information 
flow policies, including a form of declassification known as delimited information 
release [26]. They show how self-composition can be formulated in logics describ-
ing these languages, namely, Hoare logic, separation logic, linear temporal logic,
etc. They also discuss how to use their results for model checking programs
with finite state spaces to check satisfaction of their generalized definition of
noninterference.

The first work that used a Hoare-style semantics to reason about information 
flow was by Andrews and Reitman [3]. Their assertions keep track of the security
level of variables, and are able to deal even with parallel programs. However, no 
formal correctness result is stated. 

Conclusion. This paper was inspired in part by presentations by Roberto Gia-
cobazzi and Reiner Hähnle at the Dagstuhl Seminar on Language-based Security
in October 2003. The reported work is only the first step in our goal to formulate 
more general definitions of noninterference in terms of program (in)dependence, 
such that the definitions support modular reasoning. One direction to consider 
is to repeat the work in this paper for a richer language, with methods, pointers, 
objects and dynamic memory allocation; an obvious goal here is interprocedural 
reasoning about variable independences perhaps using a higher-order version of 
the frame rule [22]. Hähnle’s Dagstuhl presentation inspired us to look at ex-
plaining insecurity by showing counterexamples. We plan to experiment with
model checkers supporting linear arithmetic, for example BLAST [15], to (i) 
establish independences that our logic cannot find (cf. the false positives from 
Sect. 9); (ii) provide “genuine” counterexamples that are counterexamples wrt. 
the original semantics. 
Abstract. The race condition checker rccjava uses a formal type sys-
tem to statically identify potential race conditions in concurrent Java
programs, but it requires programmer-supplied type annotations. This 
paper describes a type inference algorithm for rccjava. Due to the in- 
teraction of parameterized classes and dependent types, this type in- 
ference problem is NP-complete. This complexity result motivates our 
new approach to type inference, which is via reduction to propositional 
satisfiability. This paper describes our type inference algorithm and its 
performance on programs of up to 30,000 lines of code. 

1 Introduction 

A race condition occurs when two threads in a concurrent program manipu- 
late a shared data structure simultaneously, without synchronization. Errors 
caused by race conditions are notoriously hard to catch using testing because 
they are scheduling dependent and difficult to reproduce. Typically, program- 
mers attempt to avoid race conditions by adopting a programming discipline in 
which shared variables are protected by locks. 

In a previous paper [10], we described a static analysis tool called rccjava 
that enforces this lock-based synchronization discipline. The analysis performed 
by rccjava is formalized as a type system, and it incorporates features such 
as dependent types (where the type of a field describes the lock protecting it) 
and parameterized classes (where fields in different instances of a class can be 
protected by different locks). 

Our previous evaluation of rccjava indicates that it is effective for catching 
race conditions. However, rccjava relies on programmer-inserted type annota- 
tions that describe the locking discipline, such as which lock protects a particular 
field. The need for these type annotations limits rccjava’s applicability to large,
legacy systems. Hence, to achieve practical static race detection for large pro- 
grams, annotation inference techniques are necessary. 

In previous work along these lines, we developed Houdini/rcc [11], a type in-
ference algorithm for rccjava that heuristically generates a large set of candidate
type annotations and then iteratively removes all invalid annotations. However, 
this approach could not handle parameterized classes or methods, which limits 
its ability to handle many of the synchronization idioms of real programs. 

* This work was supported in part by the National Science Foundation under Grants 
CCR-0341179 and CCR-0341387. 
In the presence of parameterized classes, the type inference problem for
rccjava is NP-complete, meaning that any type inference algorithm will have 
an exponential worst-case behavior. This complexity result motivates our new 
approach to type inference, which is via reduction to propositional satisfiability. 
That is, given an unannotated (or partially-annotated) program, we translate 
this program into a propositional formula that is satisfiable if and only if the 
original program is typeable. Moreover, after computing a satisfying assignment 
for the generated formula, we translate this assignment into appropriate annota- 
tions for the program, yielding a valid, explicitly-typed program. This approach 
works well in practice, and we report on its performance on programs of up to 
30,000 lines of code. 

Producing a small number of meaningful error messages for erroneous or 
untypeable programs is often challenging. We tackle this aspect of type inference 
by generating a weighted MAX-SAT problem [4] and producing error messages 
for the unsatisfied clauses in the optimal solution. Our experience shows that 
the resulting warnings often correspond to errors in the original program, such 
as accessing a field without holding the appropriate lock. 

We have implemented our algorithm in the Rcc/Sat tool for multithreaded 
Java programs. Experiments on benchmark programs demonstrate that it is 
effective at inferring valid type annotations for multithreaded code. The algo- 
rithm’s precision is significantly improved by performing a number of standard 
analyses, such as control-flow and escape analysis, prior to type checking.

The key contributions of this paper include: 

— a type inference algorithm based on reduction to propositional satisfiability; 

— a refinement of this approach to generate useful error messages via reduction 
to weighted MAX-SAT; and 

— experimental results that validate the effectiveness of this approach. 

The annotations constructed by Rcc/Sat also provide valuable documentation 
to the programmer; facilitate checking other properties such as atomicity [16, 
15, 12]; and can help reduce state explosion in model checkers [24, 25, 14, 9].

2 Types Against Races 

2.1 Type Checking 

This section introduces rfj2, an idealized multithreaded subset of Java with a
type system that guarantees race freedom for well-typed programs. This type
system extends our previous work on the rccjava type system [10], for example 
with parameterized methods. To clarify our presentation, rfj2 also simplifies 
some aspects of rccjava. For example, it does not support inheritance. (Inher- 
itance and other aspects of the full Java programming language are dealt with 
in our implementation, described in Section 4.) 

An rfj2 program (see Figure 1) is a sequence of class declarations together 
with an initial expression. Each class declaration associates a class name with 
a body that consists of a sequence of field and method declarations. The self- 
reference variable “this” is implicitly bound within the class body. 




118 Cormac Flanagan and Stephen N. Freund 



P     ::= defn* e                                   (program)
defn  ::= class cn<ghost x*> { field* meth* }       (class declaration)
field ::= t fn guarded_by l                         (field declaration)
meth  ::= t mn<ghost x*>(arg*) requires s { e }     (method declaration)
arg   ::= t x                                       (argument declaration)
c, t  ::= cn<l*>                                    (type)
l     ::= x | α | l · θ                             (lock expression)
s     ::= ∅ | {l} | s ∪ s | β | s · θ               (lock set expression)
θ     ::= [x1 := l1, ..., xn := ln]                 (substitution)
e, f  ::= x | null | new c(e*) | e.fn | e.fn = e    (expressions)
        | e.mn<l*>(e*) | let x = e in e
        | synchronized x e | e.fork

α ∈ LockVar        x, y ∈ Var        fn ∈ FieldName
β ∈ LockSetVar     cn ∈ ClassName    mn ∈ MethodName

Fig. 1. The idealized language rfj2.



The rfj2 language includes type annotations that specify the locking dis-
cipline. For example, the type annotation guarded_by x on a field declaration
states that the lock denoted by the variable x must be held whenever that field
is accessed (read or written). Similarly, the type annotation requires x1, ..., xn
on a method declaration states that these locks are held on method entry; the
type system verifies that these locks are indeed held at each call-site of the
method, and checks that the method body is race-free given this assumption.

The language provides parameterized classes, to allow the fields of a class to 
be protected by some lock external to the class. A parameterized class declaration 

class cn<ghost x1 ... xn> { ... }

introduces a binding for the ghost variables x1 ... xn, which can be referred to
from type annotations within the class body. The type cn<y1 ... yn> refers to an
instantiated version of cn, where each xi in the body is replaced by yi. As an
example, the type Hashtable<y1, y2> may denote a hashtable that is protected
by lock y1, where each element of the hashtable is protected by lock y2.

The rfj2 language also supports parameterized method declarations, such as

t m<ghost x>(cn<x> y) requires x { ... }

which defines a method m that is parameterized by lock x, and which takes an
argument of type cn<x>. A corresponding invocation e.m<z>(e') must supply a
ghost argument z and an actual parameter e' of type cn<z>.

Expressions include object allocation new c(e*), which initializes a new ob- 
ject’s fields with its argument values; field read and update; method invocation; 
and variable binding and reference. The expression synchronized x e is evalu-
ated in a manner similar to Java’s synchronized statement: the lock for object
(a) Example Program Ref

    class Lock<> { }
    class Ref<ghost x> {
      int y guarded_by α1;
      boolean lessThan(Ref<α2> o) requires β {
        this.y < o.y;
      }
    }

    let lock = new Lock<>();
        r1 = new Ref<α3>(1);
        r2 = new Ref<α4>(2)
    in synchronized (lock) {
      r1.lessThan(r2);
    }

(b) Constraints

    α1 ∈ { this, x }                                   decl. of y
    α2 ∈ { this, x }                                   decl. of lessThan
    β ⊆ { this, x, o }                                 decl. of lessThan
    α3 ∈ { lock }                                      first new expr.
    α4 ∈ { lock, r1 }                                  second new expr.
    α1 ∈ β                                             access to this.y
    α1 · [this := o, x := α2] ∈ β                      access to o.y
    β · [this := r1, x := α3, o := r2] ⊆ {lock}        requires for call
    α2 · [this := r1, x := α3, o := r2] = α4           arg. type for call

(c) Conditional Assignment

    Y(α1) = (b1 ? this : x)                                             decl. of y
    Y(α2) = (b2 ? this : x)                                             decl. of lessThan
    Y(β)  = (b4 ? {this} : ∅) ∪ (b5 ? {x} : ∅) ∪ (b6 ? {o} : ∅)         decl. of lessThan
    Y(α3) = lock                                                        first new expr.
    Y(α4) = (b3 ? lock : r1)                                            second new expr.

(d) Boolean Constraints

    (b1 ? this : x) ∈ (b4 ? {this} : ∅) ∪ (b5 ? {x} : ∅) ∪ (b6 ? {o} : ∅)              access to this.y
    (b1 ? o : (b2 ? this : x)) ∈ (b4 ? {this} : ∅) ∪ (b5 ? {x} : ∅) ∪ (b6 ? {o} : ∅)   access to o.y
    (b4 ? {r1} : ∅) ∪ (b5 ? {lock} : ∅) ∪ (b6 ? {r2} : ∅) ⊆ {lock}                     requires for call
    (b2 ? r1 : lock) = (b3 ? lock : r1)                                                arg. type for call

(e) Boolean Formula

      [(b1 ∧ b4) ∨ (¬b1 ∧ b5)]                                 access to this.y
    ∧ [(b1 ∧ b6) ∨ (¬b1 ∧ ((b2 ∧ b4) ∨ (¬b2 ∧ b5)))]           access to o.y
    ∧ [¬b4 ∧ ¬b6]                                              requires for call
    ∧ [(b2 ∧ ¬b3) ∨ (¬b2 ∧ b3)]                                arg. type for call

Fig. 2. Example program and type inference constraints.



x is acquired, the subexpression e is then evaluated, and finally the lock is re-
leased. The expression e.fork starts a new thread. Here, e should evaluate to
an object that includes a nullary method run. The fork operation spawns a new 
thread that calls that run method. 

The rfj2 type system leverages parameterized methods to reason about
thread-local data. (This approach replaces the escape analysis embedded in
our earlier type system [10].) Specifically, the run method of each forked thread
takes a ghost parameter tl_lock denoting a thread-local lock that is always held
by that thread:

t run<ghost tl_lock>() requires tl_lock { e }
Intuitively, the underlying run-time system creates and acquires this thread-local
lock when a new thread is created. This lock may be used to guard thread-local
data and may be passed as a ghost parameter to other methods that access
thread-local data. In a similar fashion, we also introduce an implicit, globally-
visible lock called main_lock, which is held by the initial program thread and
can be used to protect data exclusively accessed by that thread. 

2.2 Type Inference 

Our previous evaluation of the race-free type system rccjava indicates that it 
is effective for catching race conditions [10]. However, the need for programmer- 
inserted annotations limits its applicability to large, legacy systems, which mo-
tivates the development of type inference techniques for race-free type systems.

In this paper we describe a novel type inference system for rfj2. We intro-
duce lock variables α and lock set variables β, collectively referred to as lock-
ing variables. Locking variables may be mentioned in type annotations, as in
guarded_by α, requires β, or cn<α1, α2>. During type inference, each lock vari-
able α is resolved to some specific program variable in scope, and each lock set
variable β is resolved to some set of program variables in scope. As an exam-
ple, Figure 2(a) presents a simple reference cell implementation, written in rfj2
extended with primitive types and operations, that contains locking variables.

An rfj2 program is explicitly-typed if it contains no locking variables. The 
type inference problem is, given a program with locking variables, to resolve these 
locking variables so that the resulting explicitly-typed program is well-typed. 

Parameterized classes introduce substitutions that complicate the type infer-
ence problem. We use the notation [x1 := l1, ..., xn := ln] to denote a substi-
tution θ that replaces each program variable xi with the lock expression li. To
illustrate the need for these substitutions, consider the class declaration:

class cn<ghost x> { t fn guarded_by l; }

If a variable p has type cn<y>, then the field p.fn is protected by θ(l), where
the substitution θ = [x := y] replaces the formal ghost parameter x by the
actual parameter y. The application of a substitution to most syntactic entities is
straightforward; however, the application of a substitution θ to a lock expression
l is delayed until any lock variables α in the lock expression are resolved. We
use the syntax l · θ to represent this delayed substitution. Similarly, if the lock
set expression s denotes the set of locks in a method’s requires clause, then the
application of a substitution θ to s yields the delayed substitution s · θ. The
following examples illustrate substitutions on various syntactic entities. (Due to
space limitations, we do not present an exhaustive definition.)

θ(x) = l    if θ = [..., x := l, ...]
θ(synchronized x e) = synchronized θ(x) θ(e)
θ(l) = l · θ
θ(s) = s · θ

Since the type rules reason about delayed substitutions, we include these de- 
layed substitutions in the programming language syntax, but we require that 
substitutions do not appear in source programs. 
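
The distinction between immediate and delayed substitution can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: program variables are strings, and the classes LockVar and Delayed and the function subst are hypothetical names, not part of rfj2 or Rcc/Sat. A variable that θ does not mention is left unchanged, which is also an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LockVar:
    """An unresolved lock variable such as α1."""
    name: str

@dataclass(frozen=True)
class Delayed:
    """The delayed substitution l · θ."""
    lock: object
    theta: tuple        # θ stored as sorted (variable, lock expression) pairs

def subst(theta, lock):
    """Apply θ (a dict from program variables to lock expressions) to a lock."""
    if isinstance(lock, str):
        # Program variable: θ(x) = l if θ = [..., x := l, ...].
        return theta.get(lock, lock)
    # Lock variable or already delayed expression: θ(l) = l · θ.
    return Delayed(lock, tuple(sorted(theta.items())))
```

For instance, subst({"x": "y"}, "x") yields "y" immediately, whereas applying the same substitution to LockVar("a1") only records it for later.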
The type rules for rfj2 generate a collection of constraints that contain
delayed substitutions. These constraints include equality constraints between
lock expressions and containment constraints between lock set expressions:

C ::= s ⊆ s | l = l

The core of the type system is defined by the judgment:

P; E; s ⊢ e : t & C

Here, the program P is included to provide access to class declarations; E is an 
environment providing types for the free variables of the expression e; the lock 
set s describes the locks held when executing e; t is the type inferred for e; and 
C is the generated set of constraints. 

Most of the type rules are straightforward. The complete set of type judg- 
ments and rules is contained in Appendix A. Here we briefly explain two of the 
more crucial rules. The rule for synchronized x e checks e with an extended 
lock set that includes x, since the lock x is held when evaluating e. The rule 
for e.fn checks that e is a well-typed expression of some class type cn<l1..n> and
that cn has a field fn of type t, guarded by lock l.

P; E; s ⊢ x : t' & C
P; E; s ∪ {x} ⊢ e : t & C'
--------------------------------------------
P; E; s ⊢ synchronized x e : t & (C ∪ C')

P; E; s ⊢ e : cn<l1..n> & C
class cn<ghost x1..n> { ... t fn guarded_by l ... } ∈ P
θ = [this := e, xj := lj for j ∈ 1..n]
P; E ⊢ θ(t) & C'
--------------------------------------------
P; E; s ⊢ e.fn : θ(t) & (C ∪ C' ∪ {θ(l) ∈ s})

Since the protecting lock expression l (and type t) may refer to the ghost
parameters x1..n and the implicitly-bound self-reference this, neither of which
is in scope at the field access, we introduce the substitution θ which substi-
tutes appropriate expressions for these variables. The constraint θ(l) ∈ s, an
abbreviation for {θ(l)} ⊆ s, ensures that the substituted lock expression is in
the current lock set. The type of the field dereference is computed by applying
the substitution θ to the field type t, which must yield a well-formed type.
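
To make the two rules concrete, here is a minimal Python sketch of a checker for explicitly-typed expressions. It is an illustration only, under simplifying assumptions: receivers are plain variables, types are not computed, and the names Sync, Access, CLASSES, and check are hypothetical. The class Cell is an invented second example alongside Ref.

```python
from dataclasses import dataclass

@dataclass
class Sync:
    """synchronized x e"""
    lock: str
    body: object

@dataclass
class Access:
    """x.fn, with the receiver restricted to a variable in this sketch"""
    receiver: str
    field: str

# class name -> (ghost parameter names, {field name: guarding lock})
CLASSES = {
    "Ref": (["x"], {"y": "x"}),        # int y guarded_by x
    "Cell": ([], {"v": "this"}),       # field v guarded_by this
}

def check(expr, env, held):
    """Return the violated constraints θ(l) ∈ s as (lock, field) pairs.
    env maps a variable to (class name, ghost arguments); held is the lock set s."""
    if isinstance(expr, Sync):
        # The body is checked with the extended lock set s ∪ {x}.
        return check(expr.body, env, held | {expr.lock})
    class_name, ghost_args = env[expr.receiver]
    params, fields = CLASSES[class_name]
    # θ = [this := receiver, x1 := l1, ..., xn := ln]
    theta = {"this": expr.receiver, **dict(zip(params, ghost_args))}
    guard = fields[expr.field]
    lock = theta.get(guard, guard)
    return [] if lock in held else [(lock, expr.field)]
```

With env = {"r1": ("Ref", ["lock"])}, the access r1.y is accepted inside synchronized lock and rejected outside it, because θ(x) = lock must be in the current lock set.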

The type system defines the top-level judgment P ⊢ C, where C is the
generated set of constraints for the program P. Applying these type rules to the 
example program Ref of Figure 2(a) yields the constraints shown in Figure 2(b). 
(We ignore main_lock in this example for simplicity). 

We next address the question of when the generated constraints over the
locking variables are satisfiable. An assignment

A : (LockVar → Var) ∪ (LockSetVar → 2^Var)

resolves lock and lock set variables to corresponding program variables and sets of
program variables, respectively. We extend assignments to lock expressions, lock
set expressions, and substitutions. In particular, since an assignment resolves all
locking variables, any delayed substitutions can be immediately performed.

A : l → Var
A(x) = x
A(l · θ) = A(θ)(A(l))

A : s → 2^Var
A(∅) = ∅
A({l}) = {A(l)}
A(s1 ∪ s2) = A(s1) ∪ A(s2)
A(s · θ) = A(θ)(A(s))

A : θ → θ
A([x1 := l1, ..., xn := ln]) = [x1 := A(l1), ..., xn := A(ln)]
We also extend assignments in a compatible manner to other syntactic units,
such as constraints, expressions, programs, etc. 

An assignment A satisfies a constraint C (written A ⊨ C) as follows:

A ⊨ s1 ⊆ s2 iff A(s1) ⊆ A(s2)
A ⊨ l1 = l2 iff A(l1) = A(l2)

If A ⊨ C for every constraint C in a set of constraints, then we say A is a solution
for that set, and we write A ⊨ C for the set as well. A set of constraints C is valid,
written ⊨ C, if every assignment is a solution for C.
For example, the constraints of Figure 2(b) for the program Ref are satisfied by
the assignment: α1 = α2 = x, α3 = α4 = lock, and β = {x}.
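
The claim about the example assignment can be checked mechanically. The following Python sketch encodes the constraints of Figure 2(b) and evaluates them under an assignment; the tuple encoding and the names resolve, resolve_set, and satisfies are assumptions made for this illustration, not the authors' implementation.

```python
def resolve(A, lock):
    """A(l): a program variable (str) is kept, ("var", a) is looked up in A,
    and ("delay", l, theta) becomes A(theta)(A(l))."""
    if isinstance(lock, str):
        return lock
    if lock[0] == "var":
        return A[lock[1]]
    _, inner, theta = lock
    resolved_theta = {x: resolve(A, l) for x, l in theta.items()}
    value = resolve(A, inner)
    return resolved_theta.get(value, value)

def resolve_set(A, s):
    """A(s) for a literal set, ("setvar", b), or ("sdelay", s, theta)."""
    if isinstance(s, (set, frozenset)):
        return {resolve(A, l) for l in s}
    if s[0] == "setvar":
        return set(A[s[1]])
    _, inner, theta = s
    resolved_theta = {x: resolve(A, l) for x, l in theta.items()}
    return {resolved_theta.get(v, v) for v in resolve_set(A, inner)}

def a(i):
    return ("var", f"a{i}")          # the lock variable αi

BETA = ("setvar", "beta")
CALL = {"this": "r1", "x": a(3), "o": "r2"}

MEMBER = [                            # pairs (l, s) read as l ∈ s
    (a(1), {"this", "x"}), (a(2), {"this", "x"}),
    (a(3), {"lock"}), (a(4), {"lock", "r1"}),
    (a(1), BETA),
    (("delay", a(1), {"this": "o", "x": a(2)}), BETA),
]
SUBSET = [(BETA, {"this", "x", "o"}), (("sdelay", BETA, CALL), {"lock"})]
EQUAL = [(("delay", a(2), CALL), a(4))]

def satisfies(A):
    return (all(resolve(A, l) in resolve_set(A, s) for l, s in MEMBER)
            and all(resolve_set(A, s1) <= resolve_set(A, s2) for s1, s2 in SUBSET)
            and all(resolve(A, l1) == resolve(A, l2) for l1, l2 in EQUAL))

PAPER_ASSIGNMENT = {"a1": "x", "a2": "x", "a3": "lock", "a4": "lock", "beta": {"x"}}
```

Under this encoding, satisfies(PAPER_ASSIGNMENT) holds, whereas changing α1 to this violates the constraint α1 ∈ β.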

We say P is well-typed if P ⊢ C and the constraints C are satisfiable. If a
solution A for the constraints C exists, the following theorem states that the
explicitly-typed program A(P) is well-typed. (Proofs for the theorems in this
paper appear in an extended report [13].)

Theorem 1. If P ⊢ C and A ⊨ C then A(P) ⊢ A(C) and ⊨ A(C).

For explicitly-typed programs, since the generated constraints C do not con-
tain locking variables, checking the satisfiability of C is straightforward. In the
more general case where P is not explicitly-typed, the type inference problem
involves searching for a solution A for the generated constraints C. Due to the
interaction between parameterized classes and dependent types, the type infer-
ence problem for rfj2 (and similarly for rccjava) is NP-complete. (The proof
is via a reduction from propositional satisfiability.)

Theorem 2. For an arbitrary program P, the problem of finding an assignment
A such that A(P) is explicitly-typed and A(P) ⊢ C and ⊨ C is NP-complete.

Despite this worst-case complexity result, we demonstrate a technique in the 
next section that has proven effective in practice. 

3 Solving Constraint Systems 

3.1 Generating Boolean Constraints 

For each lock variable α mentioned in the program, the type rules introduce a
scope constraint α ∈ {x1, ..., xn} that constrains α to be one of the variables
x1, ..., xn in scope. A similar constraint β ⊆ {x1, ..., xn} is introduced for
each lock set variable β. These scope constraints specify the possible choices for
each locking variable, and enable us to translate each constraint C over locking
variables into a Boolean constraint D that uses Boolean variables to encode the
possible choices for each locking variable. The notation b ? X : Y denotes X if
the Boolean variable b is true, and denotes Y otherwise.

D ::= S ⊆ S | L = L                      (Boolean constraints)
L ::= x | b ? L : L                      (conditional lock expressions)
S ::= ∅ | {L} | b ? S : S | S ∪ S        (conditional lock set expressions)
b ∈ BoolVar                              (Boolean variables)
From the scope constraints, we generate a conditional assignment

Y : (LockVar → L) ∪ (LockSetVar → S)

that encodes the possible choices for each locking variable. For example, the
scope constraints α ∈ {x1, ..., xn} and β ⊆ {y1, ..., yn} yield:

Y(α) = b1 ? x1 : (b2 ? x2 : (... (b(n-1) ? x(n-1) : xn) ...))
Y(β) = (b'1 ? {y1} : ∅) ∪ ... ∪ (b'n ? {yn} : ∅)

where each Boolean variable bi and b'i is fresh¹.
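
The encoding of scope constraints can be sketched as follows. The tuple representation ("if", b, X, Y) for b ? X : Y, the list-of-terms reading of the union, and the function names cond_lock and cond_set are assumptions chosen for this example only.

```python
from itertools import count

def cond_lock(scope, fresh):
    """Y(α) for α ∈ scope: b1 ? x1 : (b2 ? x2 : ... xn), as nested tuples.
    Uses n - 1 fresh Boolean variables for n candidates."""
    if len(scope) == 1:
        return scope[0]
    return ("if", next(fresh), scope[0], cond_lock(scope[1:], fresh))

def cond_set(scope, fresh):
    """Y(β) for β ⊆ scope: one term (b ? {y} : ∅) per candidate y;
    the returned list is read as the union of its terms."""
    return [("if", next(fresh), frozenset({y}), frozenset()) for y in scope]

# A supply of fresh Boolean variable names b1, b2, ...
names = (f"b{i}" for i in count(1))
y_alpha = cond_lock(["this", "x"], names)
y_beta = cond_set(["this", "x", "o"], names)
```

Here y_alpha is ("if", "b1", "this", "x"), and y_beta uses the next three names, one per candidate lock.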

We extend the conditional assignment to translate each constraint C to a
Boolean constraint D = Y(C), and to translate lock expressions, lock set ex-
pressions, and substitutions, as follows. Since the conditional assignment (con-
ditionally) resolves locking variables, as part of this translation we immediately
apply any delayed substitutions, to yield a substitution-free Boolean constraint:

Y : C → D
Y(s1 ⊆ s2) = Y(s1) ⊆ Y(s2)
Y(l1 = l2) = Y(l1) = Y(l2)

Y : s → S
Y(∅) = ∅
Y({l}) = {Y(l)}
Y(s1 ∪ s2) = Y(s1) ∪ Y(s2)
Y(s · θ) = Y(θ)(Y(s))

Y : l → L
Y(x) = x
Y(l · θ) = Y(θ)(Y(l))

Y : θ → θ
Y([x1 := l1, ..., xn := ln]) = [x1 := Y(l1), ..., xn := Y(ln)]

Figure 2(c) and (d) show the conditional assignment and Boolean constraints
for the example program Ref.

A truth assignment B : BoolVar → Boolean assigns truth values to Boolean
variables. We extend truth assignments to L and S in a straightforward manner:

B : L → Var
B(x) = x
B(b ? L1 : L2) = B(L1)    if B(b)
                 B(L2)    if ¬B(b)

B : S → 2^Var
B(∅) = ∅
B({L}) = {B(L)}
B(b ? S1 : S2) = B(S1)    if B(b)
                 B(S2)    if ¬B(b)
B(S1 ∪ S2) = B(S1) ∪ B(S2)



A truth assignment B satisfies a set of Boolean constraints if B ⊨ D for each
constraint D in the set, where:

B ⊨ S1 ⊆ S2 iff B(S1) ⊆ B(S2)
B ⊨ L1 = L2 iff B(L1) = B(L2)

For example, the Boolean constraints of Figure 2(d) are satisfied by the following
truth assignment: b1 = b2 = b4 = b6 = false and b3 = b5 = true.
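
The example truth assignment can be verified by evaluating the conditional expressions directly. The Python sketch below is an illustration under assumed encodings: ("if", b, X, Y) stands for b ? X : Y, ("union", ...) for a union of conditional sets, and ev, beta, and checks are hypothetical helper names.

```python
def ev(B, e):
    """B(L) or B(S): ("if", b, X, Y) picks X when B[b] is true and Y otherwise,
    ("union", S1, ...) unions its parts, and any other value is returned as is."""
    if isinstance(e, tuple) and e[0] == "if":
        return ev(B, e[2] if B[e[1]] else e[3])
    if isinstance(e, tuple) and e[0] == "union":
        return set().union(*(ev(B, s) for s in e[1:]))
    return e

def beta(this, x, o):
    """Y(β) with the three candidate locks renamed by a substitution."""
    return ("union", ("if", "b4", {this}, set()),
                     ("if", "b5", {x}, set()),
                     ("if", "b6", {o}, set()))

def checks(B):
    """The four Boolean constraints of Figure 2(d), evaluated under B."""
    s = ev(B, beta("this", "x", "o"))
    return [
        ev(B, ("if", "b1", "this", "x")) in s,                        # access to this.y
        ev(B, ("if", "b1", "o", ("if", "b2", "this", "x"))) in s,     # access to o.y
        ev(B, beta("r1", "lock", "r2")) <= {"lock"},                  # requires for call
        ev(B, ("if", "b2", "r1", "lock")) == ev(B, ("if", "b3", "lock", "r1")),
    ]

B = dict(b1=False, b2=False, b3=True, b4=False, b5=True, b6=False)
```

All four entries of checks(B) are true for this B; setting b5 to false, for example, breaks both access constraints.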

The application of a truth assignment B to a conditional assignment Y yields
the (unconditional) assignment B(Y), defined as B(Y)(x) = B(Y(x)).

The translation from constraints to Boolean constraints is semantics-preser- 
ving, in the sense that if the generated Boolean constraints are satisfiable, then 
the original constraints are also satisfiable. 

¹ We could encode the choice for the first constraint as a decision tree with only log n
Boolean variables. 
Theorem 3. Suppose D = Y(C) and let B be a truth assignment. Then
B(Y) ⊨ C if and only if B ⊨ D.



3.2 Solving Boolean Constraints 

The final step is to find a truth assignment B satisfying the generated Boolean 
constraints D. We accomplish this step by translating D into a Boolean formula 
F, which can then be solved by a standard propositional satisfiability solver such 
as Chaff [21]. The Boolean formula syntax and this translation are as follows: 



F ::= true | false | b | F ∨ F | F ∧ F | ¬F

[[·]] : D → F

[[x = x]] = true
[[x = y]] = false    if x ≠ y
[[L = (b ? L1 : L2)]] = (b ∧ [[L = L1]]) ∨ (¬b ∧ [[L = L2]])
[[(b ? L1 : L2) = L]] = [[L = (b ? L1 : L2)]]
[[∅ ⊆ S]] = true
[[(S1 ∪ S2) ⊆ S]] = [[S1 ⊆ S]] ∧ [[S2 ⊆ S]]
[[(b ? S1 : S2) ⊆ S]] = (b ∧ [[S1 ⊆ S]]) ∨ (¬b ∧ [[S2 ⊆ S]])
[[{L} ⊆ ∅]] = false
[[{L} ⊆ (b ? S1 : S2)]] = (b ∧ [[{L} ⊆ S1]]) ∨ (¬b ∧ [[{L} ⊆ S2]])
[[{L} ⊆ (S1 ∪ S2)]] = [[{L} ⊆ S1]] ∨ [[{L} ⊆ S2]]
[[{L1} ⊆ {L2}]] = [[L1 = L2]]

A set of Boolean constraints is translated to the conjunction of the formulas
for its members.



Figure 2(e) presents the formulas for the four constraints from our example 
program. This translation is semantics preserving with respect to the standard 
notion of satisfiability B \= F for Boolean formulas. 

Theorem 4. If F = [[D]] then for all B, B ⊨ F if and only if B ⊨ D.

In summary, our type inference algorithm proceeds as follows: Given a pro-
gram P with locking variables, we generate from P a collection of constraints
C over the locking variables; we extract a conditional assignment Y from C
and generate Boolean constraints D = Y(C); and we generate a corresponding
Boolean formula F = [[D]]. We then use a propositional satisfiability solver to
search for a truth assignment B satisfying F. If one is found, we have B ⊨ D
by Theorem 4 and B(Y) ⊨ C by Theorem 3, and therefore the explicitly-
typed program (B(Y))(P) is well-typed. Conversely, if the generated formula F
is unsatisfiable, then there is no assignment A such that A(P) is well-typed.
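
The last two steps of the pipeline can be sketched end to end in Python for the example of Figure 2. This is a hedged illustration, not the Rcc/Sat implementation: the tuple encodings, the helper names (eq, sub, holds, solve), and the exhaustive search standing in for a real SAT solver such as Chaff are all assumptions.

```python
from itertools import product

EMPTY = ("empty",)

def ite(b, x, y): return ("if", b, x, y)        # b ? x : y
def one(l): return ("single", l)                # {L}

def eq(l1, l2):
    """[[L1 = L2]]; lock expressions are strings or ite tuples."""
    if isinstance(l2, tuple):                   # [[L = (b ? La : Lb)]]
        _, b, la, lb = l2
        return ("or", ("and", b, eq(l1, la)), ("and", ("not", b), eq(l1, lb)))
    if isinstance(l1, tuple):                   # [[(b ? La : Lb) = L]]: swap sides
        return eq(l2, l1)
    return l1 == l2                             # two program variables

def cases(b, f_then, f_else):
    return ("or", ("and", b, f_then), ("and", ("not", b), f_else))

def sub(s1, s2):
    """[[S1 ⊆ S2]]"""
    if s1[0] == "empty":
        return True
    if s1[0] == "union":
        return ("and", sub(s1[1], s2), sub(s1[2], s2))
    if s1[0] == "if":
        return cases(s1[1], sub(s1[2], s2), sub(s1[3], s2))
    # s1 is a singleton {L}: decompose the right-hand side.
    if s2[0] == "empty":
        return False
    if s2[0] == "if":
        return cases(s2[1], sub(s1, s2[2]), sub(s1, s2[3]))
    if s2[0] == "union":
        return ("or", sub(s1, s2[1]), sub(s1, s2[2]))
    return eq(s1[1], s2[1])                     # [[{L1} ⊆ {L2}]] = [[L1 = L2]]

def holds(f, B):
    """B ⊨ F for formulas built from bools, variable names, and/or/not."""
    if isinstance(f, bool):
        return f
    if isinstance(f, str):
        return B[f]
    if f[0] == "not":
        return not holds(f[1], B)
    if f[0] == "and":
        return holds(f[1], B) and holds(f[2], B)
    return holds(f[1], B) or holds(f[2], B)

def solve(formulas, variables):
    """Exhaustive stand-in for a SAT solver: a satisfying assignment or None."""
    for values in product([False, True], repeat=len(variables)):
        B = dict(zip(variables, values))
        if all(holds(f, B) for f in formulas):
            return B
    return None

def beta(this, x, o):
    return ("union", ite("b4", one(this), EMPTY),
            ("union", ite("b5", one(x), EMPTY), ite("b6", one(o), EMPTY)))

FORMULAS = [
    sub(one(ite("b1", "this", "x")), beta("this", "x", "o")),
    sub(one(ite("b1", "o", ite("b2", "this", "x"))), beta("this", "x", "o")),
    sub(beta("r1", "lock", "r2"), one("lock")),
    eq(ite("b2", "r1", "lock"), ite("b3", "lock", "r1")),
]
SOLUTION = solve(FORMULAS, ["b1", "b2", "b3", "b4", "b5", "b6"])
```

For this example the search finds b1 = b2 = b4 = b6 = false and b3 = b5 = true, which corresponds to the assignment α1 = α2 = x, α3 = α4 = lock, and β = {x}. An exhaustive search is exponential in the number of Boolean variables, so it is suitable only for toy inputs.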



4 Implementation 

We have implemented our inference algorithm in the Rcc/Sat checker, which 
supports the full Java programming language (although it does not currently 
detect race conditions on array accesses). Rcc/Sat takes as input an unannotated 
or partially-annotated program, where any typing annotations are provided in 
comments starting with “#”, as in /*# guarded_by y */. 
Rcc/Sat first adds a predetermined number of ghost parameters to all classes
and methods lacking user-specified parameters. Next, for each unguarded field,
Rcc/Sat adds the annotation guarded_by α, where α is fresh. Rcc/Sat also uses
fresh locking variables to add any missing requires annotations and class and 
method instantiation parameters. Rcc/Sat then performs our type inference al- 
gorithm. If the generated constraints are satisfiable, then the satisfying assign- 
ment is used to generate an explicitly-typed version of the program. Section 4.2 
outlines how we generate meaningful error messages when they are not. 

4.1 Java Features 

We handle additional features of the Java programming language as follows. 
Scope Constraints. Rcc/Sat permits lock expressions to be any final object 
references, including: (1) this; (2) ghost parameters; (3) final variables, static 
fields, and parameters; and (4) well-typed expressions of the form e.f, where e
is a constant expression and f is a final field. This set may be infinite, and we
heuristically limit it to expressions with at most two field accesses. 
Inheritance, Subtyping, and Interfaces. Given the declaration 

class C<ghost a1, ..., an> extends D<ghost b1, ..., bk> { ... }

we consider the type instantiation C<l1..n> to be an immediate subtype of D<m1..k>
provided mi = bi[aj := lj for j ∈ 1..n] for all i ∈ 1..k. The subtyping relation is the re-
flexive and transitive closure of this rule. The signature of an overriding method
must match that of the overridden form, after applying the type parameter sub- 
stitutions induced by the inheritance hierarchy. Interfaces are handled similarly. 
Inner Classes. Non-static inner classes may access the type parameters from 
the enclosing class and may declare their own parameters. Thus, the complete 
type for such a class is Outer<l1..n>.Inner<m1..k>.

Static Fields, Methods, and Inner Classes. Static members may not refer 
to the enclosing class’ type parameters since static members are not associated 
with a specific instantiation of the class. 

Thread Objects. To allow Thread objects to store thread-local data in their 
fields, Rcc/Sat adds an implicit final field tl_lock to each Thread class. This 
field is analogous to (and replaces) the ghost parameter on the run method in 
rfj2. It may guard other fields and is assumed to be held when run is invoked. 
Escape Mechanisms. We provide escapes from the rfj2 type system through
a “no_warn” annotation that suppresses the generation of constraints for a line of
code. Also, since ghost parameters are erased at run time, the ghost parameters
in typecasts of the form (C<a>)x are not checked dynamically.

4.2 Reporting Errors 

We introduce two important improvements that enable the tool to pinpoint likely 
errors in the program when the generated constraints are unsatisfiable. 

First, we change the algorithm to check each field declaration in a program 
separately, thereby enabling us to distinguish fields with potential races from 
those that are race-free. To check a single field, we generate the constraints as
before, except that we only add field access constraints for accesses to the field
of interest. The analysis is compositional in this manner because the presence
or absence of races on one field is independent of races on other fields.

There is a possibility that the same locking variable will be assigned different 
values when checking different fields. If this occurs, we can compose the results 
of the separate checks together by introducing additional type parameters and 
renaming locking variables as necessary. For example, if a type instantiation C(α) 
of class C(ghost x) becomes C(l1) when checking one field of C and C(l2) when 
checking another, we can change the class declaration to C(ghost x1, x2), and 
instantiate it as C(l1, l2) at the conflicting location. 

Second, when there are race conditions on a field, it is often desirable to infer 
the most likely lock protecting it and then generate errors for locations where 
that lock is not held. For example, the following program is not well-typed: 



    1: class C(ghost y) { 
    2:   int c guarded_by α; 
    3:   void f1() requires y { c = 1; } 
    4:   void f2() requires y { c = 2; } 
    5:   void f3() requires this { c = 3; } 
    6: } 

Our tool produces the following diagnostic message at the likely error site: 

    C.java:5: Lock ’y’ not held on access to ’c’. Locks held: { this }. 



To pinpoint likely error locations in this way, we express type inference as an 
optimization problem instead of a satisfiability problem. First, we add weights 
to some of the generated constraints, as follows. A constraint C with weight w 
is written as the weighted constraint W = C|w. 

    α ∈ {y, this, no_lock}        Scope constraint for c 
    α ∈ {y, this} |2              Requirement that c is guarded by a valid lock 
    α ∈ {y, no_lock} |1           Access constraint for c from f1 
    α ∈ {y, no_lock} |1           Access constraint for c from f2 
    α ∈ {this, no_lock} |1        Access constraint for c from f3 



These five constraints refer to no_lock, a lock name used in the checker to indi- 
cate that no reasonable guarding lock can be found for a field. Given constraints 
C and weighted constraints W, we compute the optimal assignment A such that: 

1. A ⊨ C for all C ∈ C, and 

2. the sum Σ {w | C|w ∈ W ∧ A ⊨ C} is maximized. 

Note that we do not require all constraints in W to be satisfied by A. For the 
constraints above, A is the assignment α = y, with a value of 4. We then generate 
error messages for all constraints in W that are not satisfied by A. The constraint 
α ∈ {this, no_lock} |1 is not satisfied by the optimal assignment A, yielding 
the above error message. Conversely, if the optimal assignment A did not satisfy 
the constraint α ∈ {y, this} |2, then we would generate the error message: 



    C.java:2: No consistent guarding lock for field ’c’. 

We have found that the heuristic of weighting declaration constraints 2-4 times 
more than field access constraints works well in practice. 
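
The optimization over the five example constraints can be sketched by brute force. This is only an illustration under stated assumptions: it enumerates the three candidate values for the locking variable directly instead of translating to weighted MAX-SAT, and the names `HARD`, `WEIGHTED`, `best_assignment`, and the labels are hypothetical, not part of the tool.

```python
# Hard constraint: alpha must be one of the locks in scope for field c.
HARD = [{"y", "this", "no_lock"}]

# Weighted constraints: (allowed values for alpha, weight, label).
# The labels are illustrative only, not the checker's diagnostics.
WEIGHTED = [
    ({"y", "this"}, 2, "declaration of c (line 2)"),
    ({"y", "no_lock"}, 1, "access to c in f1 (line 3)"),
    ({"y", "no_lock"}, 1, "access to c in f2 (line 4)"),
    ({"this", "no_lock"}, 1, "access to c in f3 (line 5)"),
]


def best_assignment(candidates, hard, weighted):
    """Return (value, score, violated_labels) maximizing the satisfied weight."""
    best = None
    for value in sorted(candidates):
        if not all(value in allowed for allowed in hard):
            continue  # hard constraints must always hold
        score = sum(w for allowed, w, _ in weighted if value in allowed)
        violated = [label for allowed, _, label in weighted if value not in allowed]
        if best is None or score > best[1]:
            best = (value, score, violated)
    return best


value, score, violated = best_assignment({"y", "this", "no_lock"}, HARD, WEIGHTED)
print(value, score, violated)
```

Each unsatisfied weighted constraint corresponds to one reported error site; here only the access in f3 is left unsatisfied.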

We solve the constraint optimization problem for W and C by translating 
the constraints into a weighted MAX-SAT problem and solving it with the PBS 
tool [4]. The translation is similar to the case without weights. PBS and similar 
tools can find optimal assignments for formulas including up to 50-100 weighted 
clauses. Optimizing over a larger number of weighted clauses is currently compu- 
tationally intractable. Thus, we still check one field at a time and only optimize 
over constraints generated by field accesses, placing all constraints for requires 
clauses and type equality in C. If C is not satisfiable, we forego the optimiza- 
tion step and instead generate error messages for constraints in the smallest 
unsatisfiable core of C, which we find with Chaff [21]. 

4.3 Improving Precision 

Rcc/Sat implements a somewhat more expressive type system than that de- 
scribed in Section 2 to handle the synchronization patterns of large programs 
more effectively. In particular: 

— Unreachable code is not type checked. 

— Read-shared fields do not need guarding locks. A read-shared field is a field 
that is initialized while local to its creating thread, and subsequently shared 
in read-only mode among multiple threads. 

— A field’s protecting lock need not be held for accesses occurring when only a 
single thread exists or when the object has not escaped its creating thread. 

Programs typically relax the core lock-based synchronization discipline along 
these lines. The checker currently uses quite basic implementations of rapid type 
analysis [5], escape analysis [6], and control-flow analysis for this step. Using 
more precise analyses would further improve our type inference algorithm. 

5 Evaluation 

We applied Rcc/Sat to benchmark programs including elevator, a discrete 
event simulator [28]; tsp, a Traveling Salesman Problem solver [28]; sor, a sci- 
entific computing program [28]; the mtrt ray-tracing program and jbb business 
objects simulator benchmarks [23]; and the moldyn, montecarlo, and raytracer 
benchmarks [20]. We ran these experiments on a 3.06GHz Pentium 4 processor 
with 2GB of memory, with Rcc/Sat configured to insert one ghost parameter on 
classes, interfaces, and instance methods and two parameters on static methods. 

Table 1 shows, for each benchmark, the size in lines of code, the overall 
time for type inference, and the average type inference time per field. It also 
shows the size of the constraint problem generated, in number of constraints 
and the number of variables and clauses in the resulting Boolean formula, after 
conversion to CNF. The preliminary analyses described in Section 4.3 typically 
consumed less than 2% of the run time on the larger benchmarks. 

Table 1. Summary of test program performance. 

Program       Size     Time  Time/    Number of      Formula Size  Manual Fields
             (LOC)      (s)  Field  Constraints     vars  clauses  Annot.  Total   read-  race-     no
                               (s)                                                shared   free  guard
elevator       529      5.0   0.22          215    1,449    3,831       0     23      17      6      0
tsp            723      6.9   0.19          233    2,090    7,151       3     37      21     16      3
sor            687      4.5   0.15          130      562    1,205       1     29      22      7      0
raytracer    1,982     21.0   0.27          801    9,436   29,841       2     77      45     28      4
moldyn       1,408     12.6   0.12          904    4,011   10,036       3    107      57     44      6
montecarlo   3,674     20.7   0.19        1,097    9,003   25,974       1    110      68     42      0
mtrt        11,315    138.8    1.5        5,636   38,025  123,046       6    181     112     69      4
jbb         30,519  2,773.5   3.52       11,698  146,390  549,667      40    787     472    295     20



The “Manual Annotations” column reflects the number of annotations man- 
ually inserted to guide the analysis. We added these few annotations to suppress 
warnings only in situations where immediately identifiable local properties en- 
sured correctness. The manual annotations were inserted, for example, to delin- 
eate single-threaded parts of the program after joining all spawned threads; to 
explicitly instantiate classes in two places where the scope constraint generation 
heuristics did not consider the appropriate locks; and to identify thread-local 
object references not found by our escape analysis. In jbb, we also added anno- 
tations to suppress spurious race-condition warnings on roughly 25 fields with 
benign races. These fields were designed to be write-protected [12], meaning that 
a lock guarded write accesses, but read accesses were not synchronized. This 
idiom is unsafe if misused but permits synchronization-free accessor methods. 

The last four columns show the total number of fields in the program, as well 
as their breakdown into read-shared fields, race-free fields, and fields for which 
no guarding lock was inferred. The analyses described in Section 4.3 reduced the 
number of fields without valid guards by 20%-75%, a significant percentage. 

Rcc/Sat identified three fields in the tsp benchmark on which there are 
intentional races. On raytracer, Rcc/Sat identified a previously known race on 
a checksum field and reported spurious warnings on three fields. It also identified 
a known race on a counter in mtrt. The remaining warnings were spurious and 
could be eliminated by additional annotations or, in some cases, by improving 
the precision of the additional analyses of Section 4.3. 

Overall, these results are quite promising. Manually inserting a small number 
of annotations enables Rcc/Sat to verify that the vast majority (92%-100%) of 
fields are race-free. These results show a substantial improvement over previous 
type inference algorithms for race-free type systems, such as Houdini/rcc. 



6 Related Work 

Boyapati and Rinard have defined a race-free type system with a notion of object 
ownership [7]. They include special owners to indicate thread-local data, thereby 
allowing a single class declaration to be used for both thread-local instances and 
shared instances, which motivated some of our refinements in rfj2. They present 
an intraprocedural algorithm to infer ownership parameters for class instanti- 
ations within a method. This simpler intraprocedural context yields equality 
constraints over lock variables, which can be efficiently solved using union-find. 
We believe it may be possible to extend our interprocedural type inference al- 
gorithm to accommodate ownership types. Grossman has developed a race-free 
type system for Cyclone, a statically safe variant of C [18]. Cyclone has a number 
of additional features, such as existential quantification and singleton types, and 
it remains to be seen how our techniques would apply in this setting. 

The requires annotations used in our type system essentially constrain the 
effects that the method may produce. Thus, we are performing a form of effect 
reconstruction [27, 26], but our dependent types are not amenable to traditional 
effect reconstruction techniques. Similarly, the constraints of our type system 
do not exhibit the monotonicity properties that facilitate the polynomial time 
solvers used in other constraint-based analyses (see, for example, Aiken’s sur- 
vey [2]). Cardelli [8] was among the first to explore type checking for dependent 
types. Our dependent types are comparatively limited in expressive power, but 
the resulting type checking and type inference problems are decidable. 

Eraser [22] is a tool for detecting race conditions in unannotated programs dy- 
namically (though it may fail to detect certain errors because of insufficient test 
coverage). Agarwal and Stoller [1] present a dynamic type inference technique 
for the type system of Boyapati and Rinard. Their technique extracts locking 
information from a program trace and then performs a static analysis involving 
unique pointer analysis [3] and intraprocedural ownership inference [7] to con- 
struct annotations. These dynamic analyses complement our static approach, 
and it may be possible to leverage their results to facilitate type inference. 

A common and significant problem with many type-inference techniques is 
the inability to construct meaningful error messages when inference fails (see, 
for example, [29, 30, 19]). An interesting contribution of our approach is that we 
view type inference as an optimization problem over a set of constraints that 
attempts to produce the most reasonable error messages for a program. 

7 Conclusions 

This paper contributes a new type inference algorithm for race-free type systems, 
which is based on reduction to propositional satisfiability. Our experimental re- 
sults demonstrate that this approach works well in practice on benchmarks of up 
to 30,000 lines of code. Extending and evaluating this approach on significantly 
larger benchmarks remains an issue for future work. We also demonstrate exten- 
sions to facilitate reliable error reporting. We believe the resulting annotations 
and race-free guarantee provided by our type inference system have a wide range 
of applications in the analysis, validation, and verification of multithreaded pro- 
grams. In particular, they provide valuable documentation to the programmer, 
they facilitate checking other program properties such as atomicity, and they 
can help reduce state explosion in model checkers. 

A Type System 

This appendix provides a complete definition of rfj2. We first informally define 
a number of predicates. (See [17] for their precise definition.) 

    Predicate                  Meaning 
    ClassOnce(P)               no class is declared twice in P 
    FieldsOnce(P)              no class contains two fields with the same name 
    MethodsOncePerClass(P)     no method name appears more than once per class 

A typing environment is defined as: 

    E ::= ∅ | E, t x | E, ghost x 



P ⊢ C 

    ClassOnce(P)    FieldsOnce(P) 
    MethodsOncePerClass(P) 
    P = defn_{1..n} e 
    P ⊢ defn_i & C_i    ∀i ∈ 1..n 
    P; ghost main_lock; {main_lock} ⊢ e : t & C 
    ------------------------------------------------ 
    P ⊢ C ∪ C_{1..n} 

P ⊢ defn & C 

    garg_i = ghost x_i    ∀i ∈ 1..n 
    E = garg_{1..n}, cn(x_{1..n}) this 
    P; E ⊢ field_i & C_i    ∀i ∈ 1..j 
    P; E ⊢ meth_i & C'_i    ∀i ∈ 1..k 
    C = C_{1..j} ∪ C'_{1..k} 
    ------------------------------------------------ 
    P ⊢ class cn(ghost x_{1..n}) { field_{1..j} meth_{1..k} } & C 



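P; E ⊢ wf & C 

    ------------------------------------------------ 
    P; ∅ ⊢ wf & ∅ 

    P; E ⊢ t & C 
    x ∉ dom(E) 
    ------------------------------------------------ 
    P; E, t x ⊢ wf & C 

    P; E ⊢ wf & C 
    x ∉ dom(E) 
    ------------------------------------------------ 
    P; E, ghost x ⊢ wf & C 

P; E ⊢ t & C 

    P; E ⊢ wf & C 
    class cn(ghost x_{1..n}) ... ∈ P 
    C' = C ∪ {l_i ∈ dom(E) | i ∈ 1..n} 
    ------------------------------------------------ 
    P; E ⊢ cn(l_{1..n}) & C' 

P; E ⊢ field & C 

    P; E ⊢ t & C 
    C' = C ∪ {l ∈ dom(E)} 
    ------------------------------------------------ 
    P; E ⊢ t fn guarded_by l & C' 

P; E ⊢ meth & C 

    garg_i = ghost x_i    ∀i ∈ 1..n 
    E' = E, garg_{1..n}, arg_{1..d} 
    P; E'; s ⊢ e : t & C 
    C' = C ∪ {s ⊆ dom(E')} 
    s is either {y_1, ..., y_k} or β 
    ------------------------------------------------ 
    P; E ⊢ t mn(ghost x_{1..n})(arg_{1..d}) requires s { e } & C' 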



P; E; s ⊢ e : t & C 

    P; E ⊢ c & C 
    ------------------------------------------------ 
    P; E; s ⊢ null : c & C 

    P; E ⊢ wf & C 
    E = E_1, t x, E_2 
    ------------------------------------------------ 
    P; E; s ⊢ x : t & C 

    P; E; s ⊢ e : cn(l_{1..n}) & C 
    P; E ⊢ cn(l'_{1..n}) & C' 
    C'' = C ∪ C' ∪ {l_i = l'_i | i ∈ 1..n} 
    ------------------------------------------------ 
    P; E; s ⊢ e : cn(l'_{1..n}) & C'' 

    y is fresh 
    θ = [x_j := l_j (j ∈ 1..n), this := y] 
    P; E, cn(l_{1..n}) y; s ⊢ e_i : θ(t_i) & C_i    ∀i ∈ 1..k 
    class cn(ghost x_{1..n}) { field_{1..k} meth_{1..m} } ∈ P 
    field_i = t_i fn_i guarded_by l'_i    ∀i ∈ 1..k 
    P; E ⊢ cn(l_{1..n}) & C' 
    C'' = C_{1..k} ∪ C' ∪ {l_j ∈ dom(E) | j ∈ 1..n} 
    ------------------------------------------------ 
    P; E; s ⊢ new cn(l_{1..n})(e_{1..k}) : cn(l_{1..n}) & C'' 

    P; E; s ⊢ e : cn(l_{1..n}) & C 
    class cn(ghost x_{1..n}) { ... t fn guarded_by l ... } ∈ P 
    θ = [this := e, x_j := l_j (j ∈ 1..n)] 
    P; E ⊢ θ(t) & C' 
    ------------------------------------------------ 
    P; E; s ⊢ e.fn : θ(t) & (C ∪ C' ∪ {θ(l) ∈ s}) 

    P; E; s ⊢ e : cn(l_{1..n}) & C 
    class cn(ghost x_{1..n}) { ... t fn guarded_by l ... } ∈ P 
    θ = [this := e, x_j := l_j (j ∈ 1..n)] 
    P; E; s ⊢ e' : θ(t) & C' 
    ------------------------------------------------ 
    P; E; s ⊢ e.fn = e' : θ(t) & (C ∪ C' ∪ {θ(l) ∈ s}) 

    P; E; s ⊢ e_1 : t_1 & C_1 
    P; E, t_1 x; s ⊢ e_2 : t_2 & C_2 
    θ = [x := e_1] 
    P; E ⊢ θ(t_2) & C_3 
    C = (C_1 ∪ C_2 ∪ C_3) 
    ------------------------------------------------ 
    P; E; s ⊢ let x = e_1 in e_2 : θ(t_2) & C 

    P; E; s ⊢ e : cn(l_{1..n}) & C 
    class cn(ghost x_{1..n}) { ... t mn(ghost y_{1..k})(t_j z_j (j ∈ 1..d)) requires s' { e' } ... } ∈ P 
    θ = [this := e, x_i := l_i (i ∈ 1..n), y_i := l'_i (i ∈ 1..k), z_j := e_j (j ∈ 1..d)] 
    P; E; s ⊢ e_j : θ(t_j) & C_j    ∀j ∈ 1..d 
    P; E ⊢ θ(t) & C' 
    C'' = C ∪ C_{1..d} ∪ C' ∪ {θ(s') ⊆ s} 
    ------------------------------------------------ 
    P; E; s ⊢ e.mn(l'_{1..k})(e_{1..d}) : θ(t) & C'' 

    P; E; s ⊢ x : t' & C 
    P; E; s ∪ {x} ⊢ e : t & C' 
    ------------------------------------------------ 
    P; E; s ⊢ synchronized x e : t & (C ∪ C') 

    P; E; s ⊢ e : cn(l_{1..n}) & C 
    class cn(ghost x_{1..n}) { ... meth ... } ∈ P 
    meth = t' run(ghost tl_lock)() requires tl_lock { ... } 
    ------------------------------------------------ 
    P; E; s ⊢ e.fork : t & C 


Abstract. Array-Range Analysis computes at compile time the range 
of possible index values for each array-index expression in a program. 
This information can be used to detect potential out-of-bounds array 
accesses and to identify non-aliasing array accesses. In a language like C, 
where arrays can be accessed indirectly via pointers, and where pointer 
arithmetic is allowed, range analysis must be extended to compute the 
range of possible values for each pointer dereference. 

This paper describes a Pointer-Range Analysis algorithm that computes 
a safe approximation of the set of memory locations that may be ac- 
cessed by each pointer dereference. To properly account for non-trivial 
aspects of C, including pointer arithmetic and type-casting, a range rep- 
resentation is described that separates the identity of a pointer’s target 
location from its type; this separation allows a concise representation 
of pointers to multiple arrays, and precise handling of mismatched-type 
pointer arithmetic. 



1 Introduction 

The goal of Array-Range Analysis is to compute (at compile time) the range of 
possible index values for each array-index expression in a program. This infor- 
mation can be used for many applications, such as: 

• Eliminating unnecessary or redundant bounds-checking operations, for code 
optimization [13,26]; 

• Detecting potential out-of-bounds access errors, for debugging, program ver- 
ification, or security [25, 17]; 

• Identifying non-aliasing array accesses, for program understanding, opti- 
mization, or parallelization [1, 24, 14]. 

The importance of Array-Range Analysis is reflected in the extensive body of 
research conducted over the last three decades. However, most previous work 
has focused on languages like Fortran and Java. 

The C language presents new challenges for array-range analysis. First, arrays 
can be accessed indirectly via pointers, so pointer arithmetic becomes an alter- 
native way to compute the index into an array. Second, type-casts and unions 
allow an array of one type to be accessed as an array of a different type, possibly 
with a different size. Third, even deciding what is an “array” is difficult, espe- 
cially with heap allocated storage, where the same mechanism (a call to malloc) 
is used whether one is allocating a single object or an array of objects. This also 
means that a pointer dereference to a single object and a pointer dereference to 
an array object cannot always be syntactically differentiated. 

* This work was supported in part by the National Science Foundation under grants 
CCR-9987435 and CCR-0305387. 

Given these features, we approach the problem of range analysis for C by 
treating all pointer dereferences as array accesses, and by treating each solitary 
object as an array with one element. Since an array indexing expression a[i] is 
semantically equivalent to the dereference *(a+i), the analysis can be described 
purely in terms of pointer dereferences and pointer arithmetic, rather than array 
accesses and array-index computation; hence the name Pointer-Range Analysis. 

This paper describes a Pointer-Range Analysis algorithm to compute, for 
each dereference in a program, a safe approximation of the set of memory lo- 
cations that may be accessed by the dereference. An abstract representation of 
ranges is presented that can safely and portably handle challenging aspects of 
analyzing C pointers, including: 

• Pointer arithmetic. 

• Type mismatches, which arise due to unions or casts. 

• Imprecise points-to information, where a pointer may point to one of several 
arrays. 

The pointer-range representation has three components: target location, target 
type, and offset range. The separate tracking of the target’s location and type 
allows a single location to be treated as different types, and allows precise type 
information to be maintained when location information is lost as a result of 
analysis imprecision. Maintaining types rather than numeric values of sizes pre- 
serves portability, allowing the analysis to be applied when exact sizes of types 
cannot be assumed. Experimental results are presented that show the potential 
utility of pointer-range analysis in various contexts, such as eliminating unnec- 
essary bounds checks and identifying non-aliasing accesses. 

2 Representing Ranges 

We define Pointer-Range Analysis as a forward dataflow-analysis problem, where 
at each edge of the control-flow graph (CFG), a mapping is maintained from each 
location x to an abstract representation of the range of values x may hold at 
runtime. This abstract representation must be a safe approximation (represent 
a superset) of the actual range of possible values. We follow the convention in 
dataflow analysis of performing a meet (⊓) at control-flow merge points, so the 
elements of the abstract domain must be partially ordered such that r1 ⊑ r2 
implies r1 is more approximate than r2, i.e., the range represented by r1 is a 
superset of the range represented by r2. 

When dealing only with numeric values, the Integer Interval Domain can be 
used to represent the ranges: 

Integer Interval Domain ℐ = (I, ⊑I): 

• I = {[min, max] | min, max ∈ Z ∪ {+∞, −∞}, min ≤ max} ∪ {∅} 

• [min, max] represents the set of integer values in the range min ... max. 

• [min1, max1] ⊑I [min2, max2] iff [min1, max1] ⊇ [min2, max2], that is, 
min1 ≤ min2 and max1 ≥ max2. 

• Note that ℐ is a lattice that satisfies our approximation requirement, 
with top element ⊤ = ∅, bottom element ⊥ = [−∞, +∞], and meet 
operator ⊓ = ∪. 

Figure 1(a) demonstrates the analysis of a simple example using intervals, 
and shows how the computed range can be used to decide whether an array index 
is in bounds. When assigning a constant value to i, we can map i to a precise 
interval representation (e.g., [6,6] at line 4). When the two branches merge at 
line 8, we take the meet (union) of i’s range along the two incoming branches 
to get an approximate (superset) range [3,6] of possible values for i. Since this 
falls within the legal range of [0,9] for indexing into array a, the array index at 
line 9 is guaranteed to be in-bounds. 
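
The merge and bounds check just described can be sketched as follows. This is a minimal illustration, assuming intervals are modeled as `(min, max)` tuples with `None` standing for the empty interval; the function names `meet` and `in_bounds` are hypothetical.

```python
def meet(a, b):
    """Meet of two intervals: the smallest interval covering both.

    None models the empty interval (the top element), which is the identity.
    """
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def in_bounds(index_range, length):
    """True if every index in the range is legal for an array of `length` elements."""
    if index_range is None:
        return True  # no value reaches this point, so nothing can go wrong
    lo, hi = index_range
    return lo >= 0 and hi < length


# Figure 1(a): i is [6,6] on one branch and [3,3] on the other.
merged = meet((6, 6), (3, 3))
print(merged, in_bounds(merged, 10))
```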

When dealing with pointers, however, our abstract domain must be able to 
capture information about a pointer’s target (the object to which the pointer 
may point). As a first step, we define the set Loc of abstract representatives 
of locations, or objects, defined in the program, to which a pointer may legally 
point: 

Locations Loc: 

• Loc = {v | v is a variable in the program} 
        ∪ {malloc_i | i is a program point where malloc is called} 

Each location x ∈ Loc is treated as an array object, with an associated element 
type τx and element count σx (for solitary objects, the element count is 1). For a 
heap location malloc_i, which represents all heap objects allocated at program 
point i, it may not be possible to determine a precise type and count; these 
values are inferred from the argument to malloc as follows: if the argument 
is a constant C, we set the type to char and count to C; if it is of the form 
C * sizeof(τ) we set the type to τ and count to C; otherwise, we set the type 
to void and count to 0. 
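
The three malloc cases can be sketched as a small classifier. This is only an illustration: it assumes the argument is available as source text and matches it with regular expressions, whereas an actual analysis would presumably inspect the parsed expression; the name `malloc_descriptor` is hypothetical.

```python
import re


def malloc_descriptor(arg):
    """Return (element_type, count) inferred from a malloc argument given as text."""
    arg = arg.strip()
    # Case 1: a plain constant C gives char[C].
    if re.fullmatch(r"\d+", arg):
        return ("char", int(arg))
    # Case 2: C * sizeof(t) gives t[C].
    m = re.fullmatch(r"(\d+)\s*\*\s*sizeof\s*\(\s*([^()]+?)\s*\)", arg)
    if m:
        return (m.group(2), int(m.group(1)))
    # Case 3: anything else gives void[0].
    return ("void", 0)


print(malloc_descriptor("40"))
print(malloc_descriptor("10 * sizeof(int)"))
print(malloc_descriptor("n * sizeof(int)"))
```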



    1. int a[10];                           1. int a[10]; 
    2. int i;                               2. int *p; 
    3. if (...) {                           3. if (...) { 
    4.   i = 6;      i ↦ [6,6]              4.   p = &a[6];    p ↦ (a, [6,6]) 
    5. } else {                             5. } else { 
    6.   i = 3;      i ↦ [3,3]              6.   p = &a[3];    p ↦ (a, [3,3]) 
    7. }                                    7. } 
    8.               i ↦ [3,6]              8.                 p ↦ (a, [3,6]) 
    9. a[i] = 0;     (in-bounds)            9. *p = 0;         (in-bounds) 

    (a) Arrays, with Integer Intervals      (b) Pointers, with Location-Offset 

Fig. 1. In-Bounds Access Example 


                            Location-Offset     Descriptor-Offset 
    1. int a[10], b[8]; 
    2. int *p; 
    3. if (...) { 
    4.   p = &a[6];         p ↦ (a, [6,6])      p ↦ (a : int[10], [6,6]) 
    5. } else { 
    6.   p = &b[3];         p ↦ (b, [3,3])      p ↦ (b : int[8], [3,3]) 
    7. } 
    8.                      p ↦ ⊥               p ↦ (unknown : int[8], [3,6]) 
    9. *p = 0;              (don’t know)        (in-bounds) 

Fig. 2. Multiple Target Example 



We now define a Location-Offset domain whose elements represent a pointer 
to a location plus an offset: 

Location-Offset Domain ℒ𝒪 = (LO, ⊑lo): 

• LO = (Loc ∪ {null}) × I 

• The element (x, [min, max]) represents the address of location x plus an 
offset in the range [min, max]; i.e., the range [&x + min·|τx|, &x + max·|τx|], 
where τx is the static element type of location x, and |τ| is shorthand 
for sizeof(τ), the size of τ in bytes. 

• A NULL-targeted element (null, [min, max]) represents the integer range 
[min, max]. 

• (l1, o1) ⊑lo (l2, o2) iff l1 = l2 and o1 ⊑I o2. 

• ℒ𝒪 can be converted to a lattice by adding a “top” element (⊤) 
and a “bottom” element (⊥). 

Figure 1(b) shows a program that has the same behavior as the program in 
Figure 1(a), but which uses a pointer to indirectly access the array. At line 4, 
we map p to the location-offset range (a, [6, 6]) which represents the constant 
value &a + 6·|int|. At line 8, the meet operation yields the range [&a + 3·|int|, 
&a + 6·|int|] of possible values for p. At line 9, when p is dereferenced, since a 
has 10 elements, we can verify that the range of p falls within the legal range 
[&a + 0·|int|, &a + 9·|int|], thus *p will be in-bounds. 

The Location-Offset representation has two weaknesses. First, it can only 
represent a pointer to a single target location. Consider the example in Fig- 
ure 2, where p is assigned to point to two different arrays, a and b, along the 
two branches. Using the location-offset representation, the merge point at line 8 
would map p to ⊥, since the elements (a, [6,6]) and (b, [3,3]) from the two in- 
coming branches are ⊑lo-incomparable, and thus the dereference at line 9 cannot 
be determined to be in-bounds. 

                              Location-Offset     Descriptor-Offset 
    1. int a[2]; 
    2. char *p, *q; 
    3. p = (char *)&a[0];     p ↦ (a, [0,0])      p ↦ (a : int[2], [0,0]) 
    4. q = p + 6;             q ↦ (a, [1,2])      q ↦ (a : char[8], [6,6]) 
    5. *q = 0;                (not in-bounds)     (in-bounds) 

Fig. 3. Mismatched Types Example, assuming |int| = 4 

The second weakness of the Location-Offset representation is that it may lose 
precision when handling pointer arithmetic with mismatched types. Consider 
the example in Figure 3: At line 3, p is assigned to point to an array of 2 ints. 
However, since the static type of p is char *, the pointer arithmetic at line 4 is 
char-based, so it must first be translated to int-based arithmetic before being 
applied to the int-based range (a, [0,0]). Assuming |int| = 4 and |char| = 1, 
the char-based addition of 6 becomes an int-based addition of 6 · 1/4 = 3/2, which 
must be approximated as the range [1,2]. With the computed fact (a, [1,2]), the 
dereference at line 5 is identified as being potentially out-of-bounds, even though 
in fact it is in-bounds. 
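
The precision loss can be reproduced with a short calculation. This is a sketch under the same size assumptions (|int| = 4, |char| = 1); the helper name `rescale_offset` is hypothetical, and exact rational arithmetic is used so that the rounding is explicit.

```python
import math
from fractions import Fraction


def rescale_offset(delta, from_size, to_size):
    """Express an offset of `delta` elements of size `from_size` in units of
    `to_size`, rounding outward to a safe integer interval."""
    exact = Fraction(delta * from_size, to_size)
    return (math.floor(exact), math.ceil(exact))


# Figure 3, line 4: a char-based +6 applied to an int-based range.
offset = rescale_offset(6, from_size=1, to_size=4)
print(offset)

# The array a has 2 ints, so the upper bound 2 is not a legal index.
print(0 <= offset[0] and offset[1] < 2)
```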

To address these two weaknesses of the Location-Offset domain, we track the 
type and element count of the pointer’s target separately and explicitly. First, 
we define the domain of Array Descriptors, whose elements describe the identity, 
element type, and element count of an (array) object: 

Array-Descriptor Domain 𝒟 = (D, ⊑d): 

• D = Loc′ × T × N 

  - Loc′ = Loc ∪ {unknown}, where unknown represents “an unknown 
    location”. A flat semi-lattice (Loc′, ⊑l) is defined such that for all 
    x, y ∈ Loc, unknown ⊑l x, and x ⊑l y iff x = y. 

  - T is the set of unqualified non-void non-array C types, with typedefs 
    expanded to their underlying types, and with all pointer types 
    treated as equivalent. 

• The descriptor (x, τ, σ) represents the location x treated as an array with 
at least σ elements each of type τ. For readability, we use the notation 
x : τ[σ] to represent the triple (x, τ, σ). Multi-dimensional arrays are 
flattened; e.g., a 2 × 3 array of integers y is represented as y : int[6]. 

• unknown : τ[σ] represents a location of unknown identity that is an 
array of at least σ elements of type τ. 

• (x1 : τ1[σ1]) ⊑d (x2 : τ2[σ2]) iff x1 ⊑l x2 and τ1 = τ2 and σ1 ≤ σ2. 

We now define the Descriptor-Offset Domain: 

Descriptor-Offset Domain 𝒟𝒪 = (DO, ⊑do): 

• DO = (D ∪ {null}) × I 

• The element (x : τ[σ], [min, max]) represents the address of an array 
x with at least σ elements each of type τ, plus an offset in the range 
[min·|τ|, max·|τ|]. 

• A NULL-targeted element (null, [min, max]) represents the integer range 
[min, max]. 

• (d1, o1) ⊑do (d2, o2) iff d1 ⊑d d2 and o1 ⊑I o2. 
(null is not ⊑d-comparable to any member of D). 

• 𝒟𝒪 can be converted to a lattice by adding a “top” element (⊤) 
and a “bottom” element (⊥). 

Notice that 𝒟 is partially ordered such that d1 ⊑d d2 only if the size of the array 
described by d1 is less than or equal to the size of the array described by d2. 
This ensures that 𝒟𝒪 satisfies the safe approximation requirement, since if p 
points to an array of 8 elements, it is a safe approximation to say that p points 
to an array of 6 elements. 

The rightmost columns of Figures 2 and 3 show the analysis results using 
Descriptor-Offset ranges. For Figure 2, the meet operation at line 8 sets the 
location component to unknown, but the type, count, and offset components 
are preserved: we are able to approximate the two incoming facts for p by taking 
the smaller type-count descriptor and the superset of the interval components. 
When dereferencing p, if p maps to an element (x : τ[σ], [min, max]) such that 
min ≥ 0 and max < σ, then the dereference is guaranteed to be in-bounds, even 
if x = unknown (as is the case for the dereference on line 9 of Figure 2). 

For Figure 3, at line 4 we can change the type and count components of the 
range, so that array a is now treated as an array of 8 chars. This allows us to 
recognize that the dereference at line 5 is guaranteed to be in-bounds. 
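
The meet on Descriptor-Offset ranges and the in-bounds test can be sketched as below. This is an illustration under simplifying assumptions: only targeted ranges with finite offsets are modeled, `None` stands for the bottom element, element types must match exactly (the strict ordering of Section 2), and the names `Range`, `meet`, and `in_bounds` are hypothetical.

```python
from dataclasses import dataclass

UNKNOWN = "unknown"


@dataclass(frozen=True)
class Range:
    loc: str        # target location, or UNKNOWN
    elem_type: str  # element type of the target array
    count: int      # the target has at least this many elements
    lo: int         # smallest possible offset, in elements
    hi: int         # largest possible offset, in elements


def meet(a, b):
    """Greatest common approximation of two ranges, or None for bottom."""
    if a.elem_type != b.elem_type:
        return None  # descriptors with different types are incomparable
    loc = a.loc if a.loc == b.loc else UNKNOWN
    return Range(loc, a.elem_type, min(a.count, b.count),
                 min(a.lo, b.lo), max(a.hi, b.hi))


def in_bounds(r):
    """A dereference is safe if every offset lies within the known element count."""
    return r is not None and r.lo >= 0 and r.hi < r.count


# Figure 2, line 8: p points into a on one branch and into b on the other.
merged = meet(Range("a", "int", 10, 6, 6), Range("b", "int", 8, 3, 3))
print(merged, in_bounds(merged))
```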

3 Pointer Arithmetic 

An important aspect of pointer-range analysis is the handling of pointer arith- 
metic. The six classes of additive operations in C for integers and pointers are 
listed in Figure 4, along with their semantics in terms of integer arithmetic (note 
that pointer + pointer and int − pointer are not allowed). Since the two pointer ad- 
ditions +pi^τ and +ip^τ are similar, and the subtractions −ii and −pi^τ can be trivially 
converted to the corresponding addition with the negative of the second argu- 
ment, we will only describe the handling of +ii, +pi^τ, and −pp^τ. 

3.1 Well-Typed Arithmetic 

An arithmetic operation is well typed if the actual types of the arguments match 
the types expected by the operation. With the descriptor-offset domain, a tar- 
geted range (x : τ[σ], o) represents a value of type τ *, while a NULL-targeted 
range (null, o) represents a value of type int. 



Operator : Type Integer Semantics 





int X int ^ int 


ii+iii2 = *1 -|- *2 




int X int ^ int 


il“iii2 = *1 — *2 


+'^. 

pi 


T* X int ^ T* 


P+;A = P + {i- |r|) 


+T 

ip 


int X T* ^ T* 


i+ipP = (*-|t|)+P 


_T 

pi 


T* X int ^ T* 


1 

H- 

III 

1 

e*. 


_T 

pp 


T* X T* ^ int"^ 


Pi“ppP2 = (Pi-P2)/|r| 



^The result of subtracting two pointers actually 
has an implementation-defined type ptrdiff_t. 



Fig. 4. C Addition and Subtraction 
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The addition and subtraction of two integer intervals can be safely approxi- 
mated by the following equations^: 

• [mini, maxi] + [mm2, max2] = [mini + mm2, maxi + max2] 

• [mini, maxi] — [mm2, max2] = [mini ~ max2, maxi — mm2] 

Well-typed arithmetic on descriptor-offset ranges can be evaluated by applying
these equations to the interval components of the ranges:

• Integer addition ($+_{ii}$) of two NULL-targeted ranges:

  $\langle \mathrm{NULL}, [min_1, max_1] \rangle +_{ii} \langle \mathrm{NULL}, [min_2, max_2] \rangle$

  $= \langle \mathrm{NULL}, [min_1 + min_2,\ max_1 + max_2] \rangle$

• Pointer addition ($+^{\tau}_{pi}$) of a $\tau$-based range and a NULL-targeted range:

  $\langle x : \tau[\sigma], [min_1, max_1] \rangle +^{\tau}_{pi} \langle \mathrm{NULL}, [min_2, max_2] \rangle$

  $= \langle x : \tau[\sigma], [min_1 + min_2,\ max_1 + max_2] \rangle$

• Pointer subtraction ($-^{\tau}_{pp}$) of two ranges with the same target location
  $x \neq \mathrm{UNKNOWN}$ and the same element type $\tau$:

  $\langle x : \tau[\sigma], [min_1, max_1] \rangle -^{\tau}_{pp} \langle x : \tau[\sigma], [min_2, max_2] \rangle$

  $= \langle \mathrm{NULL}, [min_1 - max_2,\ max_1 - min_2] \rangle$

In C, subtraction of two pointers ($-^{\tau}_{pp}$) is well defined only if the two
pointers point to the same array. Therefore, pointer subtraction of two ranges with
different or unknown target locations evaluates to $\top$.
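The well-typed rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Range` class, the string `"UNKNOWN"`, the `TOP` marker, and all function names are hypothetical and not taken from the paper's implementation, and infinite bounds are left out, as in the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (min, max); finite bounds only in this sketch


def interval_add(a: Interval, b: Interval) -> Interval:
    # [min1, max1] + [min2, max2] = [min1 + min2, max1 + max2]
    return (a[0] + b[0], a[1] + b[1])


def interval_sub(a: Interval, b: Interval) -> Interval:
    # [min1, max1] - [min2, max2] = [min1 - max2, max1 - min2]
    return (a[0] - b[1], a[1] - b[0])


@dataclass(frozen=True)
class Range:
    target: Optional[str]  # None models a NULL-targeted (integer) range
    elem: Optional[str]    # element type tau; None for integers
    count: Optional[int]   # element count sigma; None for integers
    offset: Interval


TOP = "TOP"  # hypothetical marker for the "no information" result


def int_range(lo: int, hi: int) -> Range:
    return Range(None, None, None, (lo, hi))


def add_ii(r1: Range, r2: Range) -> Range:
    # Integer addition of two NULL-targeted ranges.
    assert r1.target is None and r2.target is None
    return int_range(*interval_add(r1.offset, r2.offset))


def add_pi(p: Range, i: Range) -> Range:
    # Pointer addition: the descriptor is kept, only the offset interval moves.
    assert p.target is not None and i.target is None
    return Range(p.target, p.elem, p.count, interval_add(p.offset, i.offset))


def sub_pp(p1: Range, p2: Range):
    # Defined only for the same known target and the same element type.
    same = (p1.target is not None and p1.target != "UNKNOWN"
            and p1.target == p2.target and p1.elem == p2.elem)
    if not same:
        return TOP
    return int_range(*interval_sub(p1.offset, p2.offset))
```

Keeping the descriptor and the offset interval in separate fields mirrors the observation that well-typed arithmetic only ever touches the interval component.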



3.2 Mismatched-Type Arithmetic

An arithmetic operation that is not well typed can arise because C permits casting
between pointers to different types, and between integers and pointers; it
can also arise from the use of unions. This section addresses the handling of
arithmetic operations on ranges with mismatched types. This includes integer
addition ($+_{ii}$) with a pointer-typed argument, and pointer addition or subtraction
where the type of the operation does not match the argument type.

How this problem is handled depends first on the requirements of the client
of the analysis. Specifically, is the client interested only in well-typed accesses?
If so, then the result of any pointer arithmetic operation with mismatched types
should be $\top$. However, this is usually too strong a requirement for C programs,
because C's weak typing discipline means any memory location could be accessed
as if it were of any type. With this model of memory locations, we can weaken
the definition of the array-descriptor ordering defined on page 137 so that:

• $\langle x_1 : \tau_1[\sigma_1] \rangle \sqsubseteq_d \langle x_2 : \tau_2[\sigma_2] \rangle$ iff $x_1 \sqsubseteq_l x_2$ and $|\tau_1[\sigma_1]| \le |\tau_2[\sigma_2]|$.

That is, $d_1$ is a safe approximation of $d_2$ if the array described by $d_1$ is no
larger than the array described by $d_2$, regardless of the element types of the descriptors.

† For brevity, we omit details concerning infinite bounds, which are handled by setting
the upper (respectively lower) bound to plus (respectively minus) infinity if either
argument needed to compute the bound is infinite.
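This means that if the size of each type is known at analysis time, we can
convert a range's type from $\tau_a$ to $\tau_b$ as follows:

$$\langle x : \tau_a[\sigma],\ [min, max] \rangle \;\Rightarrow\; \left\langle x : \tau_b\left[\left\lfloor \sigma \cdot \frac{|\tau_a|}{|\tau_b|} \right\rfloor\right],\ \left[\left\lfloor min \cdot \frac{|\tau_a|}{|\tau_b|} \right\rfloor,\ \left\lceil max \cdot \frac{|\tau_a|}{|\tau_b|} \right\rceil\right] \right\rangle \qquad (1)$$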



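We can also transform the base type of a pointer addition by adjusting the
right-hand-side interval. A $\tau_b$-based pointer addition ($+^{\tau_b}_{pi}$), where the
right-hand side is NULL-targeted, can be converted to a $\tau_a$-based addition as follows:

$$r_1 +^{\tau_b}_{pi} \langle \mathrm{NULL}, [min_2, max_2] \rangle \;\Rightarrow\; r_1 +^{\tau_a}_{pi} \left\langle \mathrm{NULL},\ \left[\left\lfloor min_2 \cdot \frac{|\tau_b|}{|\tau_a|} \right\rfloor,\ \left\lceil max_2 \cdot \frac{|\tau_b|}{|\tau_a|} \right\rceil\right] \right\rangle \qquad (2)$$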



Transformation (1) or (2) can be used to eliminate any type mismatch, to get a
well-typed operation that can be evaluated by the equations in Section 3.1.

Revisiting the Figure 3 example, the addition p + 6 at line 4 has a type 
mismatch, because p maps to an int-based range, while the addition is char- 
based. We can apply either transformation (1) or (2), to get the following results: 



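$\langle a : \mathtt{int}[2], [0,0] \rangle +^{\mathtt{char}}_{pi} \langle \mathrm{NULL}, [6,6] \rangle$

$\Rightarrow_{(1)} \langle a : \mathtt{char}[8], [0,0] \rangle +^{\mathtt{char}}_{pi} \langle \mathrm{NULL}, [6,6] \rangle = \langle a : \mathtt{char}[8], [6,6] \rangle$

$\Rightarrow_{(2)} \langle a : \mathtt{int}[2], [0,0] \rangle +^{\mathtt{int}}_{pi} \langle \mathrm{NULL}, [1,2] \rangle = \langle a : \mathtt{int}[2], [1,2] \rangle$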

Because of the floor and ceiling operations, there may be some loss in precision
as a result of applying either transformation (1) or (2). It is therefore important
to choose a transformation that minimizes loss of precision. In practice, the size
of one of the types $\tau_a$, $\tau_b$ is usually a multiple of the size of the other (making
either $|\tau_a|/|\tau_b|$ or $|\tau_b|/|\tau_a|$ a whole number), so that at least one of the
transformations will result in no loss of precision.
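As a rough illustration of transformations (1) and (2) with exact sizes, the sketch below replays the p + 6 example. It assumes |char| = 1 and |int| = 4, which is what the char[8] result in the example implies; the function names are hypothetical.

```python
def floor_div(n: int, d: int) -> int:
    # Python's // already rounds toward minus infinity.
    return n // d


def ceil_div(n: int, d: int) -> int:
    return -((-n) // d)


def transform1(count, offset, size_a, size_b):
    """Transformation (1): retype a tau_a[count] range as a tau_b range."""
    lo, hi = offset
    new_count = floor_div(count * size_a, size_b)  # never overstate the array
    new_offset = (floor_div(lo * size_a, size_b), ceil_div(hi * size_a, size_b))
    return new_count, new_offset


def transform2(rhs, size_b, size_a):
    """Transformation (2): rescale the integer operand of a tau_b-based
    addition so that the addition becomes tau_a-based."""
    lo, hi = rhs
    return (floor_div(lo * size_b, size_a), ceil_div(hi * size_b, size_a))


SIZE_CHAR, SIZE_INT = 1, 4  # assumed platform sizes

# Via (1): <a : int[2], [0,0]> becomes <a : char[8], [0,0]>, then add [6,6].
count1, off1 = transform1(2, (0, 0), SIZE_INT, SIZE_CHAR)
via_1 = (count1, (off1[0] + 6, off1[1] + 6))

# Via (2): the char-based operand [6,6] becomes the int-based operand [1,2].
rhs2 = transform2((6, 6), SIZE_CHAR, SIZE_INT)
via_2 = (2, (0 + rhs2[0], 0 + rhs2[1]))
```

Here (1) is exact because |int| is a multiple of |char|, while (2) has to round 6/4 down to 1 and up to 2.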

Transformations (1) and (2) can only be applied if the sizes of types are known
at analysis time. If an analysis is designed to be portable across all platforms,
then specific sizes of types cannot be assumed. In such a case, we can still make
some safe approximations to get results that are more precise than $\top$, by making
use of portable information about the sizes of types as defined or implied in the
C specifications:

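1. $|\mathtt{char}| = 1$

2. $|\mathtt{char}| \le |\tau|$ for any non-void C type $\tau$.

3. $|\mathtt{char}| \le |\mathtt{short}| \le |\mathtt{int}| \le |\mathtt{long}| \le |\mathtt{long\ long}|$

4. $|\mathtt{float}| \le |\mathtt{double}| \le |\mathtt{long\ double}|$

5. $|\tau[\sigma]| = |\tau| \cdot \sigma$

6. $|\mathtt{union}\ \{\tau_1 \ldots \tau_n\}| \ge \max_{i=1,\ldots,n}(|\tau_i|)$

7. $|\mathtt{struct}\ \{\tau_1 \ldots \tau_n\}| \ge \sum_{i=1}^{n} |\tau_i|$

8. $|\mathtt{struct}\ \{\tau_1 \ldots \tau_n\}| \le |\mathtt{struct}\ \{\tau_1 \ldots \tau_n \ldots\}|$

9. $|\tau_1*| = |\tau_2*|$ for any C types $\tau_1$, $\tau_2$.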

Item 1 implies that char-pointer arithmetic is equivalent to integer arithmetic
($+^{\mathtt{char}}_{pi} = +_{ii}$, $-^{\mathtt{char}}_{pi} = -_{ii}$). Item 6 states that a union type is at
least as large as its largest member, while item 7 states that a struct type is at
least as large as the sum of its constituents' sizes (it may be larger due to
padding). Item 8 takes advantage of a subtype relationship between two structures
that share a common initial sequence. Item 9, which states that all pointers are of
the same size, is strictly speaking an unsafe assumption, but it is all but implied
by the requirements that all pointers can be cast to void * without loss of
information, and that the return value of malloc can be safely cast to any pointer
type. We therefore assume it to be true.

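The first safe approximation, which arises often because of the way we normalize
multi-dimensional arrays, is to convert a $\tau[\sigma]$-based pointer addition,
where $\tau[\sigma]$ is an array type, to a $\tau$-based pointer addition. This is done by
applying transformation (2) with the knowledge that $|\tau[\sigma]| / |\tau| = \sigma$:

$r_1 +^{\tau[\sigma]}_{pi} \langle \mathrm{NULL}, [min_2, max_2] \rangle \;\Rightarrow\; r_1 +^{\tau}_{pi} \langle \mathrm{NULL}, [min_2 \cdot \sigma,\ max_2 \cdot \sigma] \rangle$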



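Next, if we only know the relative sizes of two types, we can make the following
approximations for transformation (2).

If $|\tau_b| \le |\tau_a|$:

$$r_1 +^{\tau_b}_{pi} \langle \mathrm{NULL}, [min_2, max_2] \rangle \;\Rightarrow\; r_1 +^{\tau_a}_{pi} \left\langle \mathrm{NULL},\ \left[ \begin{cases} min_2 & \text{if } min_2 < 0 \\ 0 & \text{otherwise} \end{cases},\ \begin{cases} max_2 & \text{if } max_2 > 0 \\ 0 & \text{otherwise} \end{cases} \right] \right\rangle \qquad (2a)$$

If $|\tau_a| \le |\tau_b|$:

$$r_1 +^{\tau_b}_{pi} \langle \mathrm{NULL}, [min_2, max_2] \rangle \;\Rightarrow\; r_1 +^{\tau_a}_{pi} \left\langle \mathrm{NULL},\ \left[ \begin{cases} min_2 & \text{if } min_2 \ge 0 \\ -\infty & \text{otherwise} \end{cases},\ \begin{cases} max_2 & \text{if } max_2 \le 0 \\ +\infty & \text{otherwise} \end{cases} \right] \right\rangle \qquad (2b)$$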



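For the pointer addition p + 6 at line 4 of Figure 3, since we know $|\mathtt{char}| \le |\mathtt{int}|$,
we can apply transformation (2a) to get:

$\langle a : \mathtt{int}[2], [0,0] \rangle +^{\mathtt{char}}_{pi} \langle \mathrm{NULL}, [6,6] \rangle \;\Rightarrow_{(2a)}\; \langle a : \mathtt{int}[2], [0,0] \rangle +^{\mathtt{int}}_{pi} \langle \mathrm{NULL}, [0,6] \rangle$

$= \langle a : \mathtt{int}[2], [0,6] \rangle$

Note that the resulting range is a safe approximation (superset) of the more
precise range $\langle a : \mathtt{int}[2], [1,2] \rangle$ obtained earlier with exact size information.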
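The two portable approximations can be sketched as follows. The function names are hypothetical, `float("inf")` stands in for an infinite bound, and a zero bound is treated as exact in (2b), which is safe because scaling leaves zero unchanged.

```python
INF = float("inf")


def transform2a(rhs):
    """(2a): |tau_b| <= |tau_a|. The rescaled operand moves toward zero by an
    unknown factor, so each bound is kept only on the side away from zero."""
    lo, hi = rhs
    return (lo if lo < 0 else 0, hi if hi > 0 else 0)


def transform2b(rhs):
    """(2b): |tau_a| <= |tau_b|. The rescaled operand moves away from zero by
    an unknown factor, so a bound survives only if it points toward zero."""
    lo, hi = rhs
    return (lo if lo >= 0 else -INF, hi if hi <= 0 else INF)


def contains(interval, value):
    lo, hi = interval
    return lo <= value <= hi


# The p + 6 example: the char-based operand [6,6] becomes the int-based [0,6].
example = transform2a((6, 6))
```

Whatever the actual size ratio turns out to be, the exact rescaled operand stays inside the approximated interval, which is the sense in which the result is safe.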

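A similar approximation can be made for transformation (1), but only in one
direction:

If $|\tau_b| \le |\tau_a|$, let $n$ be such that $1 \le n \le \frac{|\tau_a|}{|\tau_b|}$; then

$$\langle x : \tau_a[\sigma],\ [min, max] \rangle \;\Rightarrow\; \left\langle x : \tau_b[n \cdot \sigma],\ \left[ \begin{cases} min & \text{if } min \ge 0 \\ -\infty & \text{otherwise} \end{cases},\ \begin{cases} max & \text{if } max \le 0 \\ +\infty & \text{otherwise} \end{cases} \right] \right\rangle \qquad (1a)$$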



A key here is that $|\tau_a[\sigma]| \ge |\tau_b[n \cdot \sigma]|$, which ensures that the right-hand
side of the transformation is a safe approximation of the left-hand side. If $\tau_a$
and $\tau_b$ are scalar types, the exact ratio is not portably defined, so the only
safe value for $n$ is 1. But if $\tau_a$ is an aggregate type, a safe $n$ can be obtained by
counting the number of elements in $\tau_a$ that are at least as big as $\tau_b$. For example,
$|\mathtt{struct\{int, long, char\}}| / |\mathtt{char}| \ge 3$, so for $\tau_b$ = char it is safe to multiply
the $\sigma$ component of the resultant range by $n = 3$.
Thus, when evaluating the pointer addition

$\langle x : \tau_a[\sigma], o_1 \rangle +^{\tau_b}_{pi} \langle \mathrm{NULL}, o_2 \rangle$

if $|\tau_a| \le |\tau_b|$, only transformation (2b) can be applied. But if $|\tau_b| \le |\tau_a|$, there is
a choice between (1a) and (2a). As was the case for transformations (1) and (2),
it is important to choose the transformation that minimizes the loss of precision.
In general, transformation (1a) is more precise if the left-hand-side offset $o_1$ is
[0,0]; otherwise (2a) is more precise.
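The selection rule just described can be sketched as below. Everything here is an assumption made for illustration: the function names, the string encoding of the known size relation, and the result tuple (base type tag, element count, offset interval).

```python
INF = float("inf")


def transform1a(count, offset, n=1):
    # (1a): retype to the smaller tau_b; a bound survives only if it is zero
    # or points toward zero, and the count is multiplied by a safe n >= 1.
    lo, hi = offset
    return n * count, (lo if lo >= 0 else -INF, hi if hi <= 0 else INF)


def transform2a(rhs):
    lo, hi = rhs
    return (lo if lo < 0 else 0, hi if hi > 0 else 0)


def transform2b(rhs):
    lo, hi = rhs
    return (lo if lo >= 0 else -INF, hi if hi <= 0 else INF)


def add_mismatched(count, o1, o2, relation, n=1):
    """Evaluate <x : tau_a[count], o1> + <NULL, o2> for a tau_b-based addition.
    relation is 'a<=b' when |tau_a| <= |tau_b| and 'b<=a' when |tau_b| <= |tau_a|."""
    if relation == "a<=b":
        lo, hi = transform2b(o2)           # only (2b) applies
        return ("tau_a", count, (o1[0] + lo, o1[1] + hi))
    if o1 == (0, 0):
        new_count, off = transform1a(count, o1, n)   # (1a) loses nothing on [0,0]
        return ("tau_b", new_count, (off[0] + o2[0], off[1] + o2[1]))
    lo, hi = transform2a(o2)               # otherwise fall back to (2a)
    return ("tau_a", count, (o1[0] + lo, o1[1] + hi))
```

With a nonzero left-hand offset such as [1,1], (1a) would produce an infinite upper bound, which is why this sketch switches to (2a) in that case.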

4 Experimental Results 

The pointer-range analysis was implemented as a context-insensitive interprocedural
dataflow analysis (operating on a supergraph of the program). Since
the interval lattice has infinite descending chains, widening [5] is used to ensure 
convergence, and narrowing is used to obtain more precise results. A points-to 
analysis [8] pass is first performed to safely account for aliasing, and also to 
identify targets of indirect procedure calls. 

The following numbers were collected to gauge the potential utility of this 
analysis for various applications. 

Bounded and Half-Open Ranges: We count the number of dereferences *p
for which p maps to a range with a known location and is either

• bounded: the offset component is finite, or

• half-open: the offset component has at least one finite bound, e.g.,
$[1, +\infty]$ or $[-\infty, 3]$.

Such ranges are potentially useful for dependence analysis, where one is
interested in whether two dereferences may access the same memory location.

In-Bounds Dereferences: At each dereference *p, if p maps to a range
$\langle x : \tau[\sigma], [min, max] \rangle$ such that $min \ge 0$ and $max < \sigma$, then the
dereference is guaranteed to be in-bounds. This information can be used to
eliminate unnecessary bounds checks, and to detect potential out-of-bounds errors.
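The categories counted above can be expressed as a small classifier. This is a sketch under assumptions: the names are hypothetical, `"UNKNOWN"` and `None` stand for an unknown and a NULL target respectively, and "half-open" is read here as exactly one finite bound so that it does not overlap with "bounded".

```python
INF = float("inf")


def classify(target, count, offset):
    """Classify the range that a dereferenced pointer maps to.
    target: location name, None for NULL, or 'UNKNOWN'
    count:  element count sigma of the descriptor (None if unknown)
    offset: (min, max) interval, possibly with infinite bounds"""
    lo, hi = offset
    known = target is not None and target != "UNKNOWN"
    finite_lo = lo != -INF
    finite_hi = hi != INF
    bounded = known and finite_lo and finite_hi
    half_open = known and (finite_lo != finite_hi)
    # In-bounds: every possible index lies in 0 .. count - 1.
    in_bounds = bounded and count is not None and lo >= 0 and hi < count
    return {"bounded": bounded, "half_open": half_open, "in_bounds": in_bounds}
```

For instance, `classify("a", 10, (0, 9))` reports an in-bounds dereference, while an offset of `(0, 10)` is bounded but may run one element past the end.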

Figure 5 presents the results of our analysis on benchmarks from Cyclone [15],
Olden [4], SPEC 95 and SPEC 2000. Column (a) gives the number of lines of code
and column (b) gives the static number of dereferences in each program.

Using the descriptor-offset (DO) representation, column (c) gives the per- 
centage of dereferences that had bounded ranges and (d) gives the percentage 
that had half-bounded ranges. These may be contrasted roughly with the results 
of numeric range analysis given in [24], which identified about 30% bounded and
40% half-bounded ranges for non-pointer variables in some small benchmarks 
(100-400 statements). 

Column (e) gives the percentage of dereferences found to be in-bounds. While 
the average percentage is quite low, there are many cases, including some larger 
programs, for which over 30% of dereferences were found to be in-bounds. 

To contrast these numbers against how well Array-Range Analysis would fare, 
columns (f)-(h) give the percentages of bounded, half-bounded, and in-bounds 
dereferences that were direct array accesses, i.e., accesses of the form a[x] where
a is an array object. These represent the results that could be obtained using an 
Array-Range Analysis approach that does not handle pointers (e.g., [5, 24]). The 
difference is large for all three categories, confirming that handling of pointers 
is important when analyzing C programs. 

To motivate the use of the DO representation rather than the simpler
location-offset (LO) representation, we evaluated the two ways in which DO can
give better results than LO:

• multi-target: DO can represent a pointer to multiple targets, as in the Figure
2 example.

• transformation (1): DO allows the application of Transformation (1) or (1a)
when handling mismatched-type operations.

We found that multi-target made a bigger difference: column (i) gives the number
of in-bounds dereferences that were not found when the multi-target ability was
disabled (on average about 11% of the in-bounds dereferences per benchmark).
Most of these come from procedure calls, where different arrays of the same
size are passed as an argument to a procedure that accesses the array. As for
transformation (1), only 35 in-bounds dereferences were not found when this
feature was disabled (one in gcc, 21 in m88ksim, and 13 in crafty). Overall, the
difference between DO and LO is significant, and shows that the type-count
descriptor is an effective mechanism for handling challenging aspects of C.

To measure the price of portability, we looked at the improvement in results
if exact sizes of types are assumed, i.e., if type mismatches are handled with
transformations (1) and (2) rather than (1a), (2a), and (2b). Only five more
in-bounds dereferences were found using exact sizes (two in gcc and three in
gap), suggesting that in practice, the portable transformations produce results
that are almost as good as the non-portable ones.

One aspect of range analysis that was not described in this paper is the treatment
of ranges at branch nodes. For example, consider a branch node containing
the predicate $v_1 < v_2$. If the before-dataflow fact mappings are:

$v_1 \mapsto \langle x : \tau[\sigma], [min_1, max_1] \rangle$

$v_2 \mapsto \langle x : \tau[\sigma], [min_2, max_2] \rangle$

then the after-fact mappings along the true branch will be:

$v_1 \mapsto \langle x : \tau[\sigma], [min_1,\ \min(max_1, max_2 - 1)] \rangle$

$v_2 \mapsto \langle x : \tau[\sigma], [\max(min_1 + 1, min_2),\ max_2] \rangle$

This is an important improvement to make for precision, as confirmed by Column
(j), which gives the number of in-bounds dereferences that were missed when the
range improvements at branch nodes were not applied (on average about 20%
of the in-bounds dereferences per benchmark).
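The true-branch refinement can be sketched on the offset intervals alone, assuming both operands share the same descriptor as in the mappings above. The function name is hypothetical.

```python
def refine_less_than(o1, o2):
    """Offsets of v1 and v2 along the true branch of `v1 < v2`.
    o1 = (min1, max1) and o2 = (min2, max2) are the before-fact intervals."""
    min1, max1 = o1
    min2, max2 = o2
    new_o1 = (min1, min(max1, max2 - 1))   # v1 is at most max2 - 1
    new_o2 = (max(min1 + 1, min2), max2)   # v2 is at least min1 + 1
    return new_o1, new_o2


# An index known to be in [0, 10], tested against a limit known to be in [3, 5].
index_after, limit_after = refine_less_than((0, 10), (3, 5))
```

In this usage the index is narrowed to [0, 4] while the limit keeps [3, 5]; this kind of narrowing is what lets a guarded array access be recognized as in-bounds.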

Precise treatment of ranges at branch nodes also lets us discover infeasible
branches. For example, at the predicate $v_1 < v_2$, if $v_1$'s range is entirely less than
$v_2$'s range, then the value of the predicate is statically known, indicating that
the false branch is infeasible. Column (k) gives the number of known predicates
found in the programs. The large number of known predicates in vortex comes
from a programming style where a series of procedure calls are each checked for
success by if statements, even though some of the procedures always return the
same value.

Finally, as a rough indicator of the efficiency of the algorithm, Figure 6 gives
the analysis times (wall-clock time, in seconds) on a 1 GHz Pentium II with 500MB
RAM, running Linux, listed in order of increasing size (by lines of code). The
benchmarks not listed each took less than a second to analyze.

Fig. 6. Analysis Times

4.1 Improvements 

The current implementation includes several weaknesses that can be addressed
with known solutions. Among the possible improvements are adding flow-sensitivity
or context-sensitivity to the points-to analysis [16, 10, 27], and adding
context sensitivity to the dataflow analysis [18], but these improvements will
increase the time complexity of the analysis.

Another aspect that could be improved is the handling of heap-allocated objects.
Currently, only malloc calls for which the argument is a constant C or an
expression C * sizeof($\tau$) are mapped to a MALLOC location with a non-void
type and non-zero count. Such cases account for 46% of the malloc calls in the
programs, so there is room for improvement. Many programs use a malloc wrapper
to check for error conditions; this common practice becomes a problem for
static analysis because it causes multiple conceptual allocation sites to be folded
into a single malloc callsite. Limited use of inlining and constant propagation
can be used to split the malloc callsite into multiple callsites, to increase the
likelihood of having a MALLOC location with a meaningful type and count.

4.2 Extensions 

The range analysis described in this paper only computes ranges with constant 
bounds. It relies on the presence of constants in the source code to derive mean- 
ingful ranges, and does not record information about the relationships between 
variables. Approaches that track symbolic ranges [2, 21] and constraints between 
variables [6, 7, 20, 3, 23] can significantly improve results in applications that are 
interested in bounds checking or discovering non-aliasing memory accesses. Ideas 
discussed in this paper could be applied to extend previous approaches to handle 
pointers in general. 





146 Suan Hsi Yong and Susan Horwitz 



String manipulation is another aspect of C worthy of special consideration. 
A string is conceptually a separate data type, with its own library to manipulate 
values, but its implementation on top of arrays makes it susceptible to out-of- 
bounds array accesses. Tracking the string length as a separate attribute from 
the array size, and deriving information based on the semantics of C library 
functions, can lead to more precise results when trying to discover potentially 
out-of-bounds dereferences [25,9], which is an important concern for program 
security. 



5 Related Work 

Range analysis has been around for decades, and was the motivating example 
used in the seminal paper on abstract interpretation [5], which introduced the 
notions of widening and narrowing. Other early work on range analysis relied on 
the presence of structured loops to infer loop bounds information [13, 26].
Verbrugge et al. describe range analysis as “generalized constant propagation” [24],
and use it for dead-code elimination and array dependence testing in the McCAT
optimizing/parallelizing compiler. Stephenson et al. [22] use range analysis
to compute the number of bits needed to store a given value in hardware.

Patterson [19] uses range analysis for static branch prediction: each variable
at each program point is mapped to a set of probability-weighted ranges. The
weights are used at branch predicates to predict the likelihood of branching in
a given direction, and are used for various code-generation optimizations. Gu et
al. [11] use range analysis to discover opportunities for array privatization and
parallelization in loops, while Gupta et al. [12] do the same for recursive
divide-and-conquer procedures. They both use a Guarded Array Region representation
that associates a predicate with each range. Balakrishnan and Reps [1] use range
analysis to infer high-level information from binary code: with a range
representation of the form $a \times [b, c] + d$, they compute value sets that are
conceptually equivalent to the high-level notion of a variable, to enable high-level
analyses like reaching definitions to be applied to binary code.

These four approaches all include the notion of a “stride” in their representation
to capture the common access pattern of arrays. Wilson et al. [27] also
use a stride to improve their pointer analysis. Conceptually, the $\tau$ component
in our descriptor-offset representation encodes the stride in a portable format,
allowing our analysis to be used in settings where exact sizes of types cannot be
assumed.

Numerous approaches compute symbolic range information, to allow tracking
of constraints between variables, but few have dealt with pointers. Rugina
and Rinard [21] compute symbolic ranges for variables including pointers, and
use linear programming to identify non-intersecting ranges that could be used
for automatic parallelization or identifying in-bounds accesses. Approaches that
deal with C strings to identify potential buffer overruns [25, 9, 17] must
necessarily handle pointers, but only char pointers; thus they do not need to address
problems related to casting.
6 Conclusion

We have presented a pointer-range analysis that extends traditional array-range 
analysis to handle pointers as well as non-trivial aspects of C, including pointer 
arithmetic and type-casting. We described two possible range representations: 
the intuitive location-offset representation, and the descriptor-offset representa- 
tion, and showed that the latter yields better results in practice. The ideas we 
have presented can provide useful insight into extending existing array-based 
range analysis to handle pointers in C-like languages. 
Abstract. In this paper we present a scalable pointer analysis for embedded
applications that is able to distinguish between instances of recursively
defined data structures and elements of arrays. The main contribution
consists of an efficient yet precise algorithm that can handle
multithreaded programs. We first perform an inexpensive flow-sensitive
analysis of each function in the program that generates semantic equations
describing the effect of the function on the memory graph. These
equations bear numerical constraints that describe nonuniform points-to
relationships. We then iteratively solve these equations in order to obtain
an abstract storage graph that describes the shape of data structures
at every point of the program for all possible thread interleavings. We
bring experimental evidence that this approach is tractable and precise
for real-size embedded applications.



1 Introduction 

The difficulty of statically computing precise points-to information is a major 
obstacle to the automatic verification of real programs. Recent successes in the 
verification of safety-critical software [3] have been enabled in part because this 
class of programs makes a very restricted use of pointer manipulations and dy- 
namic memory allocation. There are numerous pointer-intensive applications 
that are not safety-critical yet still require a high level of dependability like un- 
manned spacecraft flight control, flight data visualization or on-board network 
management for example. These programs commonly use arrays and linked lists 
to store pointers to semaphores, message queues and data packets (for interpro- 
cess communication), partitions of the memory, etc. Existing scalable pointer 
analyses [21, 15, 12, 18] are uniform, i.e. they do not distinguish between ele- 
ments of arrays or components of recursive data structures and are therefore of 
little help for the verification of these programs. It is the purpose of this paper to 
address the problem of inferring nonuniform points-to information for embedded 
programs. 

Few nonuniform pointer analyses have been studied in the literature. The 
first one has been designed by Deutsch [13, 14] and applies to programs with 
explicit data type annotations. We first redesigned Deutsch’s model in order
to analyze languages like C in which the type information cannot be trusted 
to infer the shape of a data structure [22,23]. However both approaches rely 
on a costly representation of the aliasing as an equivalence relation between 
access paths, which makes this kind of analysis inapplicable to programs larger 
than a few thousand lines. We therefore designed a new semantic model [24] 
that is both more compact and more expressive than the one based on access 
paths. The interest of the latter approach lies in the representation of dynamic
memory allocation using numerical timestamps, which turns pointer analysis 
into the classical problem of computing the numerical invariants of an arithmetic 
program. In the case of a sequential program, various optimization techniques 
can be applied that break down the complexity of analyzing large arithmetic 
programs as described in [2,3]. In the case of multithreaded arithmetic programs 
however, there are no proven techniques that can cope with shared data and 
thread interleaving efficiently and precisely. This is a major drawback knowing 
that most embedded applications are multithreaded. 

In this paper we present a pointer analysis based on the semantic model of [24] 
that can infer nonuniform points-to relations for multithreaded programs. From 
our experience with the verification of real embedded applications we observed 
that collections of objects are usually manipulated in a very regular way using 
simple loops. Furthermore, these loops are generally controlled by local scalar 
variables like an array index or a pointer to the elements of a list. It is quite 
uncommon to find global array indices or lists that are modified across function 
calls. Therefore, the information flowing through this local control structure is 
sufficient in practice to describe exactly the layout of arrays and the shape of 
linked data structures. We call it the surface structure of a program. In the new 
model proposed here we first perform a flow-sensitive analysis of the surface 
structure that automatically discovers numerical loop invariants relating array 
positions and timestamps of dynamically created objects. We use these invari- 
ants to generate semantic equations that model the effect of the function on 
the memory. We then iteratively solve the system made of the semantic equa- 
tions generated from all functions in the program. A similar approach has been 
applied in [26] for improving the precision of inclusion-based flow-insensitive 
pointer analyses. Our model can be seen as a natural extension to Andersen’s 
algorithm [1] in which variables are indexed by integers denoting array positions 
and timestamps, and inclusion constraints bear numerical relations between the 
indices of variables. We will carry on the presentation of the analysis with this 
analogy in mind. 

The paper is organized as follows. In Sect. 2 we define the base semantic 
model and the surface structure of a C program. The semantics is based on 
timestamps to identify instances of dynamically allocated objects. Section 3 de- 
scribes the abstract interpretation of the surface structure and the inference of 
numerical invariants. In Sect. 4 we show how to generate nonuniform inclusion 
constraints from the numerical relationships obtained by the analysis of the sur- 
face structure. The iterative resolution of these constraints provides us with a 
global approximation of the memory graph. We describe the implementation of
an analyzer for the full C language in Sect. 5 and give some experimental re- 
sults from the analysis of a real device driver. We end the paper with concluding 
remarks and future work. 

2 Base Semantic Model 

In [24] we have introduced a semantic model that uniquely identifies instances
of dynamically allocated objects by using timestamps of the form $(\lambda_1, \ldots, \lambda_n)$
where the $\lambda_i$ are counters associated to each loop enclosing a memory allocation
command. Consider for example the following piece of code:

Example 1.

for(i = 0; i < 10; i++)
  for(j = 0; j < 3; j++)
    a[i][j].ptr = malloc(...);

In that model we would consider the couple (i, j) a timestamp for distinguishing
between calls to the malloc command. In this paper we use a simplified
model which folds all nested loop counters into one. In the previous example,
this would result in considering the timestamp $3i + j$. This amounts to having
one global counter $\lambda$ that is incremented whenever the execution crosses a loop
and is reset to 0 whenever the execution exits an outermost loop. While both
models are equivalent in uniquely identifying dynamically allocated memory, the
loss of information about nested loop counters may lead to imprecisions when
timestamps are represented by abstract numerical lattices [19, 11, 16, 20]. This is
not an issue in embedded applications since almost all loops have constant
iteration bounds and arrays are traversed in a regular way as in the example above.
This type of loop invariants can be efficiently and exactly computed by using
the reduced product [8] of the lattices of linear equalities [19] and intervals [6]
for example.
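To make the folded timestamp concrete, the following sketch replays Example 1 with a single global counter. It assumes the counter discipline that the paper spells out for loop exits (increment at the end of every iteration, decrement when stepping out of a nested loop, reset when leaving the outermost loop); the function name and the dictionary of timestamps are hypothetical.

```python
def run_example(decrement_on_nested_exit=True):
    """Replay the two nested loops of Example 1 and record the value of the
    global counter at each malloc call."""
    lam = 0
    stamps = {}
    for i in range(10):
        for j in range(3):
            stamps[(i, j)] = lam      # timestamp attached to this allocation
            lam += 1                  # end of an inner-loop iteration
        if decrement_on_nested_exit:
            lam -= 1                  # stepping out of the nested loop
        lam += 1                      # end of an outer-loop iteration
    lam = 0                           # leaving the outermost loop resets the counter
    return stamps


with_rule = run_example(True)
without_rule = run_example(False)
```

With the decrement the timestamps are exactly 3i + j and cover 0 to 29 without gaps; without it they become 4i + j, still unique but no longer aligned with the position of a[i][j] in the array.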

Because C allows the programmer to change the layout of a structured block 
via aggressive type casts, using symbolic data selectors like in [24] for represent- 
ing points-to relations is quite challenging (see [4] for a detailed discussion of 
type casting in C). In our case this would make the analysis overly complicated 
since we also have to manage numerical constraints that relate timestamps and 
positions within blocks. We choose a simple solution that consists of using a ho- 
mogeneous byte-based representation of positions within memory blocks. This 
means that a field in a structure is identified by its byte offset from the beginning 
of the structure. As a consequence we must take architecture-dependent charac- 
teristics like alignment and padding into account. Fortunately, most C front-ends 
provide this information for free. In such a model an edge in the points-to graph
has the form $(a, o) \triangleright (a', o')$ where a, a' are addresses of blocks in memory and
o, o' are byte offsets within these blocks.

Stmt ::= n = c   (c ∈ ℕ)
       | n = m + o
       | n = m * o
       | p = &x
       | p = q + n
       | p = *q
       | *p = q
       | p = malloc()
       | while (m < n) do S1; ... ; Sn end

Fig. 1. Syntax of the core pointer language

Our purpose is to abstract a C program into a system of points-to equations
expressed by inclusion constraints similarly to Andersen’s analysis [1]. Since we
want to express nonuniform aliasing relationships, we need to assign position
and timestamp indices to semantic variables and relate them by using numerical
constraints. For example, we would like to generate an inclusion constraint for
the piece of code of Example 1 that looks like:

$*(\&a + (i \times s + o_{ptr})) \supseteq \mathrm{malloc}_t$ where $i = t \wedge t \in [0, 29]$

where s is the size of the structure contained in the two-dimensional array, $o_{ptr}$
is the offset of the field ptr in that structure and t is the timestamp of the
memory allocation statement. In order to infer this kind of constraint we must
first perform a flow-sensitive analysis over a relational numerical lattice
[19, 11, 16, 20] that computes invariants relating loop counters, array indices and
timestamps. The main difference from [24] comes from the fact that we generate
inclusion constraints without any prior knowledge of the layout of objects in the
heap. In this case it is not obvious what to do with the following piece of code:

Example 2. 

for(i = 0; i < 10; i++) {
  p = p->next;
}

The rest of this section will be devoted to defining a concrete semantic model 
that will allow us to handle this situation simply and precisely. 

We base our semantic specification on a small language that captures the 
core pointer arithmetic of C at the function level. The treatment of interproce- 
dural mechanisms is postponed until Sect. 4 where we will detail the generation 
of inclusion constraints. We call surface variable a variable which has a scalar 
type, either integer or pointer, and which does not have its address taken. The 
syntax of the language is defined in Fig. 1, where we denote by p, q, r pointer- 
valued surface variables, by m, n, o integer-valued surface variables, and by x, y, z 
all other variables. We assume that the variable on the left-hand side of an assignment
operation does not appear on the right-hand side. This will facilitate the
design of the numerical abstract interpretation in Sect. 3. It is always possible 
to rewrite the program in order to satisfy this assumption. Note that in order 
to keep the presentation simple, we focus on fundamental arithmetic operations 
and loops. All other constructs can be analyzed along the same lines. We use 
this language to model the computations that occur locally within the body of 
a C function, excluding calls to other functions. A program P in this language
is just a sequence of statements describing the pointer manipulations performed
by a function. We provide P with a small-step operational semantics given by a
transition system $(\Sigma, \to)$ defined as follows.

We first need some notations. We assume that each statement of P is assigned
a unique label $\ell$. If $\ell$ is the label of a statement, we denote by next($\ell$) the label of
the next statement of P to be executed in the natural execution order. If $\ell$ is the
label of a loop we denote by top($\ell$) the predicate that is true iff the statement
at $\ell$ is an outermost loop. A state of $\Sigma$ is a tuple $(\lambda, M, \rho, \ell)$ where $\lambda$ is an
integer denoting the global loop counter used for timestamping, M is a memory
graph, $\rho$ is an environment and $\ell$ is the label of the next statement to execute.
A memory graph is a collection of points-to edges $(a, o) \triangleright (a', o')$ where a, a' are
addresses and o, o' are integers representing byte offsets. An address is either
the location of a global variable &x or a dynamically allocated block $\mathrm{blk}_\ell(t)$,
where $\ell$ is the location of the allocation statement and t is a timestamp. We use
a special address null to represent the NULL pointer value in C. The mapping
defined by a memory graph is functional, i.e. there is at most one outgoing
edge for each memory location (a, o). We denote by M(a, o) the target location
of the edge originating from the location (a, o) if it exists or (null, 0) otherwise.
We denote by $M[(a, o) \triangleright (a', o')]$ the memory graph M which has been updated
with the edge $(a, o) \triangleright (a', o')$.

We split each pointer variable p into two variables $p_a$ and $p_o$ that
respectively denote the address of the block and the offset within this block to
which p points. An environment $\rho$ maps variables n, $p_o$ to integers and
variables $p_a$ to addresses. We denote by $\rho[u \leftarrow v]$ the environment
$\rho$ in which the variable u has been assigned the value v. Finally, we denote
by $\Omega$ a special element of $\Sigma$ representing the error state. The
transition relation $\rightarrow$ of the operational semantics is then defined in
Fig. 2. An initial state in this operational semantics assigns arbitrary integer
values to surface integer variables and the null memory location to surface
pointer variables. This amounts to considering integer variables as
uninitialized and pointers as initialized to NULL. For consistency the initial
value of $\lambda$ should be 0. In our framework an initial state describes the
memory configuration at the entry of the C function that is modeled by the
program P.

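The transition rule for loop exits requires some explanation. The global loop
counter $\lambda$ is incremented at the end of each loop iteration and
decremented whenever the execution steps out of a nested loop. Whether the
global loop counter is decremented or left unchanged at loop exit has no effect
on the uniqueness of timestamps. However, decrementation is required in order to
preserve linear relationships between $\lambda$ and byte offsets during the
traversal of multidimensional arrays. Consider the two nested loops of
Example 1. We keep the previous notations and we denote by $O$ the byte offset
within a on the left-hand side of the assignment. Then the relation between $O$
and the loop counters is given by
$O = 3 \times s \times i + s \times j + O_{ptr}$. If we use the decrementation
rule at loop exit, the global loop counter value is given by
$\lambda = 3 \times i + j$, hence $O = s \times \lambda + O_{ptr}$. Without this
rule $\lambda$ would be equal to $4 \times i + j$ and the relationship between
the global loop counter and $O$ would be lost, thereby preventing the inference
of a nonuniform points-to relation.

\[
\begin{array}{lcl}
(\lambda, M, \rho, \ell : n = c) & \rightarrow & (\lambda, M, \rho[n \leftarrow c], \mathrm{next}(\ell)) \\
(\lambda, M, \rho, \ell : n = m + o) & \rightarrow & (\lambda, M, \rho[n \leftarrow \rho(m) + \rho(o)], \mathrm{next}(\ell)) \\
(\lambda, M, \rho, \ell : n = m * o) & \rightarrow & (\lambda, M, \rho[n \leftarrow \rho(m) \times \rho(o)], \mathrm{next}(\ell)) \\
(\lambda, M, \rho, \ell : p = \&x) & \rightarrow & (\lambda, M, \rho[p_o \leftarrow 0, p_a \leftarrow \&x], \mathrm{next}(\ell)) \\
(\lambda, M, \rho, \ell : p = q + n) & \rightarrow & (\lambda, M, \rho[p_o \leftarrow \rho(q_o) + \rho(n), p_a \leftarrow \rho(q_a)], \mathrm{next}(\ell)) \\
(\lambda, M, \rho, \ell : p = *q) & \rightarrow &
  \begin{cases}
    \Omega & \text{if } \rho(q_a) = \mathrm{null} \\
    (\lambda, M, \rho[(p_a, p_o) \leftarrow M(\rho(q_a), \rho(q_o))], \mathrm{next}(\ell)) & \text{otherwise}
  \end{cases} \\
(\lambda, M, \rho, \ell : p = \mathrm{malloc}()) & \rightarrow & (\lambda, M, \rho[p_a \leftarrow \mathrm{blk}_\ell(\lambda), p_o \leftarrow 0], \mathrm{next}(\ell)) \\
(\lambda, M, \rho, \ell : *p = q) & \rightarrow &
  \begin{cases}
    \Omega & \text{if } \rho(p_a) = \mathrm{null} \\
    (\lambda, M[(\rho(p_a), \rho(p_o)) \triangleright (\rho(q_a), \rho(q_o))], \rho, \mathrm{next}(\ell)) & \text{otherwise}
  \end{cases} \\
(\lambda, M, \rho, \ell : \mathrm{while}\ (m < n)\ \mathrm{do}\ \ell' : s_1; \cdots\ \mathrm{end}) & \rightarrow & (\lambda, M, \rho, \ell') \quad \text{if } \rho(m) < \rho(n) \\
(\lambda, M, \rho, \ell : \mathrm{while}\ (m < n)\ \mathrm{do} \ldots \mathrm{end}) & \rightarrow &
  \begin{cases}
    (0, M, \rho, \mathrm{next}(\ell)) & \text{if } \rho(m) \geq \rho(n) \text{ and } \mathrm{top}(\ell) \\
    (\lambda - 1, M, \rho, \mathrm{next}(\ell)) & \text{otherwise}
  \end{cases} \\
(\lambda, M, \rho, \ell : \mathrm{end}) & \rightarrow & (\lambda + 1, M, \rho, \ell' : \mathrm{while}\ (\ldots)\ \mathrm{do} \ldots \ell : \mathrm{end})
\end{array}
\]

Fig. 2. Operational semantics of the core pointer language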
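The effect of the decrementation rule can be checked on a small simulation. The sketch below is only an illustration: it assumes an inner loop of 3 iterations (as suggested by the formula above) and an arbitrary outer bound of 4, and the function name `counter_at_body` is hypothetical. It records the value of the global loop counter each time the inner loop body is entered.

```python
def counter_at_body(outer, inner, decrement=True):
    """Return {(i, j): lam}, the global loop counter seen in the inner body."""
    lam = 0
    seen = {}
    for i in range(outer):
        for j in range(inner):
            seen[(i, j)] = lam
            lam += 1      # 'end' of the inner loop: one more iteration done
        if decrement:
            lam -= 1      # stepping out of the nested loop
        lam += 1          # 'end' of the outer loop
    return seen


# Assumed loop bounds: 4 outer iterations, 3 inner iterations.
with_dec = counter_at_body(4, 3, decrement=True)
without = counter_at_body(4, 3, decrement=False)
```

With the decrement, the counter equals 3 * i + j, so the offset s * (3 * i + j) is a linear function of the counter alone. Without it, the counter equals 4 * i + j and no such function exists.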

This operational semantics is similar to the one described in [24] with a
simplified timestamping. We need to instrument the semantics by adding an
intermediate layer between the environment and the memory that keeps track of
all memory accesses. Whenever a location is retrieved from the memory, we use
a timestamp to tag it with a unique name that we call an anchor, and we keep
the binding between this anchor and the actual memory location in a separate
structure $A$ called the anchorage. The local environment $\rho$ now maps the
address component $p_a$ of a surface variable either to an address that
explicitly appears in the body of a C function or to an anchor. We call this
refined semantics the surface semantics. More formally, the surface semantics
$(\Sigma_s, \rightarrow_s)$ of a program P is defined as follows. An extended
state of $\Sigma_s$ is a tuple $(\lambda, A, M, \rho, \ell)$ where
$(\lambda, M, \rho, \ell) \in \Sigma$ and $A$ is an anchorage. An anchor
$\mathrm{ref}_\ell(t)$ denotes the value returned by the execution of a memory
read command $\ell : p = *q$ at program point $\ell$ at time $t$. The anchorage
maps an anchor $\mathrm{ref}_\ell(t)$ to an actual memory location $(a, o)$. If
$(a, o)$ is a location stored in the environment $\rho$, $a$ may either be an
address or an anchor. We define the resolution function $\mathrm{get}_A$, which
maps $(a, o)$ to the corresponding memory location, as follows:

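\[
\mathrm{get}_A(a, o) =
\begin{cases}
(\mathrm{null}, 0) & \text{if } a \text{ is an anchor and } A(a) = (\mathrm{null}, 0) \\
(a', o + o') & \text{if } a \text{ is an anchor and } A(a) = (a', o') \\
(a, o) & \text{if } a \text{ is an address}
\end{cases}
\]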

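If p is a surface pointer and $\rho$ is an environment, we denote by
$\mathrm{get}_{A,\rho}(p)$ the memory location
$\mathrm{get}_A(\rho(p_a), \rho(p_o))$. The transition relation $\rightarrow_s$
of the surface semantics is then defined in Fig. 3. The error state in this
semantics is also denoted by $\Omega$. An initial state in the surface semantics
is simply an initial state in the base semantics with an empty anchorage. We
denote by $I$ the set of all initial states.

\[
\begin{array}{lcl}
(\lambda, A, M, \rho, \ell : p = *q) & \rightarrow_s &
  \begin{cases}
    \Omega & \text{if } \mathrm{get}_{A,\rho}(q) = (\mathrm{null}, o) \\
    (\lambda,\ A[\mathrm{ref}_\ell(\lambda) \mapsto M(\mathrm{get}_{A,\rho}(q))],\ M,\ \rho[p_a \leftarrow \mathrm{ref}_\ell(\lambda), p_o \leftarrow 0],\ \mathrm{next}(\ell)) & \text{otherwise}
  \end{cases} \\
(\lambda, A, M, \rho, \ell : *p = q) & \rightarrow_s &
  \begin{cases}
    \Omega & \text{if } \mathrm{get}_{A,\rho}(p) = (\mathrm{null}, o) \\
    (\lambda,\ A,\ M[\mathrm{get}_{A,\rho}(p) \triangleright \mathrm{get}_{A,\rho}(q)],\ \rho,\ \mathrm{next}(\ell)) & \text{otherwise}
  \end{cases}
\end{array}
\]

For all other statements:

\[
\frac{(\lambda, M, \rho, \ell) \rightarrow (\lambda', M', \rho', \ell')}
     {(\lambda, A, M, \rho, \ell) \rightarrow_s (\lambda', A, M', \rho', \ell')}
\]

Fig. 3. Surface semantics of the core pointer language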
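As an illustration of how anchors are resolved, the following sketch implements the three cases of the resolution function on a toy encoding. The encoding is an assumption made for this example only: addresses are strings such as "&a" or tuples ("blk", label, timestamp), anchors are tuples ("ref", label, timestamp), and the name `get_loc` is hypothetical.

```python
NULL = "null"


def is_anchor(a):
    """Anchors are encoded as ('ref', label, timestamp) tuples."""
    return isinstance(a, tuple) and a[0] == "ref"


def get_loc(anchorage, a, o):
    """Resolve (a, o) to a memory location through the anchorage."""
    if is_anchor(a):
        target, offset = anchorage[a]
        if (target, offset) == (NULL, 0):
            return (NULL, 0)             # the anchor names the null location
        return (target, o + offset)      # offsets accumulate through the anchor
    return (a, o)                        # plain addresses resolve to themselves


# Hypothetical anchorage: a read at label 3, time 0 returned (&a, 8).
anchorage = {("ref", 3, 0): ("&a", 8), ("ref", 3, 1): (NULL, 0)}
resolved = get_loc(anchorage, ("ref", 3, 0), 4)
```

Here `resolved` is the location at offset 12 in a: the offset 4 held by the surface variable is added to the offset 8 recorded for the anchor.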

We are interested in the collecting semantics [5] of a program P, that is the
set $C = \{ s \mid i \rightarrow_s^{*} s,\ i \in I \}$ of all states reachable
from an initial state. We define the surface structure $S$ of P as follows:

\[
S = \{ (\lambda, \rho, \ell) \mid \exists M\ \exists A : (\lambda, A, M, \rho, \ell) \in C \}
\]

An element $(\lambda, \rho, \ell)$ is called a surface configuration. The program P models
the pointer manipulations performed by a single C function. Our purpose is to 
compute a global approximation of the memory for a whole C program by first 
performing an abstract interpretation of the surface structure of each function 
in the program. The design of this abstract interpretation is straightforward 
because the surface structure is independent from the data stored in the heap and 
does not interfere with other threads. We will then generate inclusion constraints 
from the results of the analysis of the surface structure that will provide us with 
a global approximation of the memory and the anchorage structure as well. 

3 Abstract Interpretation of the Surface Structure 

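We describe the analysis of the surface structure within the framework of
Abstract Interpretation [7, 8, 5, 9]. We define an abstract environment by a pair
$(\nu^\sharp, \pi^\sharp)$ as follows:

- The component $\nu^\sharp$ is an abstract numerical relation belonging to a
  given numerical lattice $\mathcal{V}^\sharp$ [19, 11, 16, 20] that we leave as
  a parameter of our analysis. The abstract relation is a collection of
  numerical constraints between all integer-valued variables n, $p_o$ of the
  program and a special variable $\Lambda$ denoting the value of the global loop
  counter.

- The component $\pi^\sharp$ maps every variable $p_a$ to a set of abstract
  addresses.

An abstract address is either the address of a global variable &x, a dynamically
allocated block $\mathrm{blk}_\ell(\mu^\sharp)$ or an anchor
$\mathrm{ref}_\ell(\mu^\sharp)$, where $\mu^\sharp$ is an abstract numerical
relation between the loop counter variable $\Lambda$ and a special timestamp
variable denoted by $\tau$. We assume that for each set of abstract addresses,
there is at most one abstract address $\mathrm{blk}_\ell(\mu^\sharp)$ or
$\mathrm{ref}_\ell(\mu^\sharp)$ per program location $\ell$. Therefore, the set
$E^\sharp$ of all abstract environments is isomorphic to the product
$\prod_{i \in I} \mathcal{V}^\sharp$ of copies of the numerical lattice over a
fixed family $I$. We provide $E^\sharp$ with the structure of a lattice by
lifting all operations of $\mathcal{V}^\sharp$ to $E^\sharp$ pointwise.

\[
\begin{array}{lcl}
[\![ n = c ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & ((\nu^\sharp \ominus \{n\}) \oplus \{n = c\},\ \pi^\sharp) \\
[\![ n = m + o ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & ((\nu^\sharp \ominus \{n\}) \oplus \{n = m + o\},\ \pi^\sharp) \\
[\![ n = m * o ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & ((\nu^\sharp \ominus \{n\}) \oplus \{n = m \times o\},\ \pi^\sharp) \\
[\![ p = \&x ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & ((\nu^\sharp \ominus \{p_o\}) \oplus \{p_o = 0\},\ \pi^\sharp[p_a \mapsto \{\&x\}]) \\
[\![ p = q + n ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & ((\nu^\sharp \ominus \{p_o\}) \oplus \{p_o = q_o + n\},\ \pi^\sharp[p_a \mapsto \pi^\sharp(q_a)]) \\
[\![ \ell : p = *q ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & ((\nu^\sharp \ominus \{p_o\}) \oplus \{p_o = 0\},\ \pi^\sharp[p_a \mapsto \{\mathrm{ref}_\ell(\lfloor \nu^\sharp \oplus \{\tau = \Lambda\} \rfloor_{\tau,\Lambda})\}]) \\
[\![ \ell : p = \mathrm{malloc}() ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & ((\nu^\sharp \ominus \{p_o\}) \oplus \{p_o = 0\},\ \pi^\sharp[p_a \mapsto \{\mathrm{blk}_\ell(\lfloor \nu^\sharp \oplus \{\tau = \Lambda\} \rfloor_{\tau,\Lambda})\}]) \\
[\![ *p = q ]\!]^\sharp (\nu^\sharp, \pi^\sharp) & = & (\nu^\sharp, \pi^\sharp)
\end{array}
\]

Fig. 4. Abstract surface semantics of atomic statements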

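The denotation $\gamma_{\mathcal{V}^\sharp}(\nu^\sharp)$ of an abstract numerical
relation is a set of variable assignments $\varepsilon$ that satisfy the
numerical constraints expressed by $\nu^\sharp$. If $x_1, \ldots, x_n$ are
numerical variables and $v_1, \ldots, v_n$ are integer values, we denote by
$\nu^\sharp(x_1 \mapsto v_1, \ldots, x_n \mapsto v_n)$ the predicate that is true
iff there is an assignment $\varepsilon \in \gamma_{\mathcal{V}^\sharp}(\nu^\sharp)$
such that $\varepsilon(x_i) = v_i$ for all $1 \leq i \leq n$. The denotation
$\gamma_{E^\sharp}(\nu^\sharp, \pi^\sharp)$ of an abstract environment is the set
of all pairs $(\lambda, \rho)$ where $\lambda \in \mathbb{N}$ and $\rho$ is an
environment of the surface semantics, such that:

- $\nu^\sharp(n \mapsto \rho(n), \ldots, p_o \mapsto \rho(p_o), \ldots, \Lambda \mapsto \lambda)$
  for all variables n, ..., p, ... of the program

- $\rho(p_a) = \&x \Rightarrow \&x \in \pi^\sharp(p_a)$

- $\rho(p_a) = \mathrm{blk}_\ell(t) \Rightarrow \mathrm{blk}_\ell(\mu^\sharp) \in \pi^\sharp(p_a) \wedge \mu^\sharp(\tau \mapsto t, \Lambda \mapsto \lambda)$

- $\rho(p_a) = \mathrm{ref}_\ell(t) \Rightarrow \mathrm{ref}_\ell(\mu^\sharp) \in \pi^\sharp(p_a) \wedge \mu^\sharp(\tau \mapsto t, \Lambda \mapsto \lambda)$

An abstract surface configuration of the program is a family
$(\nu^\sharp_\ell, \pi^\sharp_\ell)_{\ell \in Loc(P)}$ of abstract environments,
one for each location $\ell$ in the program P considered. We provide the set of
all abstract surface configurations with a lattice structure by pointwise
extension of operations from $E^\sharp$. The denotation
$\gamma((\nu^\sharp_\ell, \pi^\sharp_\ell)_{\ell \in Loc(P)})$ of an abstract
configuration is the set of all surface configurations $(\lambda, \rho, \ell)$
such that $(\lambda, \rho) \in \gamma_{E^\sharp}(\nu^\sharp_\ell, \pi^\sharp_\ell)$.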

Following the methodology of Abstract Interpretation, we must now define
the abstract semantics of the language. We first have to define some operations
on the abstract numerical lattice $\mathcal{V}^\sharp$. If
$\nu^\sharp \in \mathcal{V}^\sharp$ and $V$ is a set of variables, we denote by
$\nu^\sharp \ominus V$ the abstract numerical relation in which all information
about variables in $V$ has been lost, and by $\lfloor \nu^\sharp \rfloor_V$ the
relation that only keeps information for variables in $V$. If $S$ is a system of
arbitrary numerical constraints, we denote by $\nu^\sharp \oplus S$ an abstract
numerical relation representing all variable assignments that are in the
denotation of $\nu^\sharp$ and that are also solutions of $S$. If $v$ is a
variable, we denote by $\nu^\sharp[v := v + c]$ the operation that consists of
adding the increment $c$ to the value of $v$. The implementation of these
operations depends on the abstract numerical lattice considered, and we refer
the reader to the corresponding papers for more details about the underlying
algorithms [6, 19, 11, 16, 20]. We assign an abstract semantics
$[\![ s ]\!]^\sharp : E^\sharp \rightarrow E^\sharp$ to each atomic statement $s$
of the language as defined in Fig. 4.

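If $(\nu^\sharp, \pi^\sharp)$ is an abstract environment, we define the result
$(\nu^\sharp_1, \pi^\sharp_1)$ of the operation
$\mathrm{inc}_\Lambda(\nu^\sharp, \pi^\sharp)$ as follows:

- $\nu^\sharp_1 = \nu^\sharp[\Lambda := \Lambda + 1]$

- For every pointer variable p, $\pi^\sharp_1(p_a)$ is obtained from
  $\pi^\sharp(p_a)$ by transforming each abstract address as follows:

\[
\begin{cases}
\&x & \mapsto\ \&x \\
\mathrm{blk}_\ell(\mu^\sharp) & \mapsto\ \mathrm{blk}_\ell(\mu^\sharp[\Lambda := \Lambda + 1]) \\
\mathrm{ref}_\ell(\mu^\sharp) & \mapsto\ \mathrm{ref}_\ell(\mu^\sharp[\Lambda := \Lambda + 1])
\end{cases}
\]

We define the operation $\mathrm{dec}_\Lambda(\nu^\sharp, \pi^\sharp)$ (resp.
$\mathrm{reset}_\Lambda(\nu^\sharp, \pi^\sharp)$) similarly by substituting the
operation $\Lambda := \Lambda - 1$ (resp. $\Lambda := 0$) for
$\Lambda := \Lambda + 1$. The abstract semantics of a program is then given by
the least solution of a recursive system of semantic equations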



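\[
(\nu^\sharp_\ell, \pi^\sharp_\ell) = F_\ell\big((\nu^\sharp_\ell, \pi^\sharp_\ell)_{\ell \in Loc(P)}\big)
\]

where $F_\ell$ is defined as follows:

- If $\ell = \mathrm{next}(\ell')$ and $\ell'$ is the location of an atomic
  statement $s$, then

\[
F_\ell\big((\nu^\sharp_\ell, \pi^\sharp_\ell)_{\ell \in Loc(P)}\big) = [\![ s ]\!]^\sharp (\nu^\sharp_{\ell'}, \pi^\sharp_{\ell'})
\]

- If $\ell' : \mathrm{while}\ (m < n)\ \mathrm{do}\ \ell : s; \cdots; \ell'' : \mathrm{end}$, then

\[
F_\ell\big((\nu^\sharp_\ell, \pi^\sharp_\ell)_{\ell \in Loc(P)}\big) = (\nu^\sharp_{\ell'} \oplus \{m < n\},\ \pi^\sharp_{\ell'}) \sqcup \mathrm{inc}_\Lambda(\nu^\sharp_{\ell''}, \pi^\sharp_{\ell''})
\]

- If $\ell = \mathrm{next}(\ell')$ and $\ell' : \mathrm{while}\ (m < n)\ \mathrm{do} \ldots \mathrm{end}$, then

\[
F_\ell\big((\nu^\sharp_\ell, \pi^\sharp_\ell)_{\ell \in Loc(P)}\big) =
\begin{cases}
\mathrm{reset}_\Lambda(\nu^\sharp_{\ell'} \oplus \{m \geq n\},\ \pi^\sharp_{\ell'}) & \text{if } \mathrm{top}(\ell') \\
\mathrm{dec}_\Lambda(\nu^\sharp_{\ell'} \oplus \{m \geq n\},\ \pi^\sharp_{\ell'}) & \text{otherwise}
\end{cases}
\]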



We apply classical fixpoint algorithms based upon iteration sequences with
widening and narrowing [5, 9] in order to obtain an upper approximation
$S^\sharp$ of the least fixpoint of the system.

Theorem 1. $S^\sharp$ is a sound approximation of the surface semantics, i.e.
$S \subseteq \gamma\big((\nu^\sharp_\ell, \pi^\sharp_\ell)_{\ell \in Loc(P)}\big)$.

For example, consider the following program in our core pointer language that
fills in an array a of pointers with newly allocated blocks:

Example 3.

    1: n = 0;
    2: while (n < 10) {
    3:     q = &a;
    4:     p = q + n;
    5:     r = malloc();
    6:     *p = r;
    7:     n = n + 1;
    8: }

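If we use the lattice of convex polyhedra [11] as the numerical lattice
$\mathcal{V}^\sharp$, then the abstract environment obtained after analysis of
the surface structure at program point 6 is:

\[
\nu^\sharp = \left\{
\begin{array}{l}
0 \leq n < 10 \\
\Lambda = n \\
q_o = r_o = 0 \\
p_o = 4 \times n
\end{array}
\right\},
\qquad
\pi^\sharp = \left\{
\begin{array}{lcl}
p_a & \mapsto & \{\&a\} \\
q_a & \mapsto & \{\&a\} \\
r_a & \mapsto & \{\mathrm{blk}_5(\tau = \Lambda,\ 0 \leq \Lambda < 10)\}
\end{array}
\right\}
\]

assuming that pointers occupy four bytes in memory.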
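The invariants stated for program point 6 can be cross-checked against a direct execution of Example 3. The sketch below is a concrete simulation, not the abstract analysis itself. It assumes that the addition at label 4 is scaled by the pointer size (four bytes, as stated above), and the tuple encoding of locations and the name `run_example3` are hypothetical.

```python
PTR_SIZE = 4  # assumption taken from the text: pointers occupy four bytes


def run_example3():
    """Execute Example 3 and record the environment reaching point 6."""
    lam = 0          # global loop counter
    memory = {}      # (address, offset) -> (address, offset)
    snapshots = []   # environments observed just before '*p = r'
    n = 0                                      # 1: n = 0
    while n < 10:                              # 2: while (n < 10)
        q = ("&a", 0)                          # 3: q = &a
        p = (q[0], q[1] + PTR_SIZE * n)        # 4: p = q + n (scaled offset)
        r = (("blk", 5, lam), 0)               # 5: r = malloc(), stamped with lam
        snapshots.append({"lam": lam, "n": n, "p": p, "q": q, "r": r})
        memory[p] = r                          # 6: *p = r
        n = n + 1                              # 7: n = n + 1
        lam += 1                               # 8: end of the loop iteration
    return memory, snapshots


memory, snapshots = run_example3()
```

Every recorded environment satisfies lam = n, 0 <= n < 10 and p_o = 4 * n, and r always points to the block allocated at label 5 with timestamp lam, which is what the abstract environment expresses.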



4 Nonuniform Inclusion Constraints 

We now use the analysis of the surface structure to build a global approximation 
of the memory graph. For this purpose we use an extension of Andersen’s in- 
clusion constraints [1] enriched with numerical indices that allow us to describe 
nonuniform points-to relations. The syntax of a nonuniform inclusion constraint 
is the following: 



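\[
\begin{array}{rcl}
Cst & ::= & (\mathcal{X}(t) \supseteq \&x + o,\ \nu^\sharp(t, o)) \\
    & \mid & (\mathcal{X}(t) \supseteq \mathrm{blk}_\ell(t') + o,\ \nu^\sharp(t, t', o)) \\
    & \mid & (\mathcal{X}(t) \supseteq \mathcal{Y}(t') + o,\ \nu^\sharp(t, t', o))
\end{array}
\]

where $t, t', o$ are special index variables denoting timestamp and offset
values and $\mathcal{X}, \mathcal{Y}$ are set variables. We assume that we are
provided with a countable collection of set variables. The second component of a
nonuniform constraint is a system of numerical relationships between the index
variables appearing in the constraint.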

The semantics of a system of nonuniform constraints is based upon an abstract
memory graph. An abstract memory graph $M^\sharp$ is a set of abstract points-to
relations

\[
(a(t, o) \triangleright a'(t', o'),\ \nu^\sharp(t, t', o, o'))
\]

where $a, a'$ are addresses and $t, t', o, o'$ are special index variables
representing the timestamps and offsets associated to each address. The abstract
numerical relation $\nu^\sharp$ expresses numerical constraints between these
index variables. The set $\mathcal{M}^\sharp$ of abstract memory graphs can be
provided with the structure of a lattice by pointwise extension of the
corresponding lattice operations over $\mathcal{V}^\sharp$. The denotation
$\gamma(M^\sharp)$ of an abstract memory graph is the set of memory graphs such
that the offsets on the points-to edges satisfy the constraints of the
corresponding abstract edges. A valuation of set variables is a set of mappings

\[
(\mathcal{X}(t) \mapsto a(t') + o,\ \nu^\sharp(t, t', o))
\]

where $a$ is an address and $t, t', o$ are numerical index variables. The set
$Val^\sharp$ of all valuations can similarly be provided with the structure of a
lattice. Note that in the case of the address of a global &x, the associated
timestamp variable does not have any meaning and is not related by any numerical
constraint. We use a uniform notation in order to keep the semantic definitions
simple. A valuation can be seen as an abstraction of the anchorage structure
defined in Sect. 2. The semantics
$[\![ C ]\!]^\sharp : \mathcal{M}^\sharp \times Val^\sharp \rightarrow \mathcal{M}^\sharp \times Val^\sharp$
of a nonuniform inclusion constraint $C$ is defined as follows:



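- $[\![ (\mathcal{X}(t) \supseteq \&x + o,\ \nu^\sharp) ]\!]^\sharp (M^\sharp, V^\sharp) = (M^\sharp,\ V^\sharp \cup \{(\mathcal{X}(t) \mapsto \&x + o,\ \nu^\sharp)\})$

- $[\![ (\mathcal{X}(t) \supseteq \mathrm{blk}_\ell(t') + o,\ \nu^\sharp) ]\!]^\sharp (M^\sharp, V^\sharp) = (M^\sharp,\ V^\sharp \cup \{(\mathcal{X}(t) \mapsto \mathrm{blk}_\ell(t') + o,\ \nu^\sharp)\})$

- $[\![ (\mathcal{X}(t) \supseteq \mathcal{Y}(t') + o,\ \nu^\sharp) ]\!]^\sharp (M^\sharp, V^\sharp) = (M^\sharp,\ V^\sharp \cup \{(\mathcal{X}(t) \mapsto a(t'') + o'',\ \lfloor \nu^\sharp \sqcap \nu^\sharp_1 \oplus \{o'' = o + o'\} \rfloor_{t,t'',o''}) \mid (\mathcal{Y}(t') \mapsto a(t'') + o',\ \nu^\sharp_1) \in V^\sharp\})$

- $[\![ (*\mathcal{X}(t) \supseteq \mathcal{Y}(t'),\ \nu^\sharp) ]\!]^\sharp (M^\sharp, V^\sharp) = (M^\sharp \cup \{(a(t, o) \triangleright a'(t', o'),\ \nu^\sharp_3) \mid (\mathcal{X}(t) \mapsto a(t) + o,\ \nu^\sharp_1) \in V^\sharp \wedge (\mathcal{Y}(t') \mapsto a'(t') + o',\ \nu^\sharp_2) \in V^\sharp \wedge \nu^\sharp_3 = \lfloor \nu^\sharp \sqcap \nu^\sharp_1 \sqcap \nu^\sharp_2 \rfloor_{t,t',o,o'}\},\ V^\sharp)$

- $[\![ (\mathcal{X}(t) \supseteq *\mathcal{Y}(t'),\ \nu^\sharp) ]\!]^\sharp (M^\sharp, V^\sharp) = (M^\sharp,\ V^\sharp \cup \{(\mathcal{X}(t) \mapsto a'(t''') + o',\ \nu^\sharp_3) \mid (\mathcal{Y}(t') \mapsto a(t'') + o,\ \nu^\sharp_1) \in V^\sharp \wedge (a(t'', o) \triangleright a'(t''', o'),\ \nu^\sharp_2) \in M^\sharp \wedge \nu^\sharp_3 = \lfloor \nu^\sharp \sqcap \nu^\sharp_1 \sqcap \nu^\sharp_2 \rfloor_{t,t''',o'}\})$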



where we have freely renamed the index variables whenever it was necessary to
avoid name clashes. A solution of a system $S$ of nonuniform set constraints is a
pair $(M^\sharp, V^\sharp)$ which is invariant under the application of
$[\![ C ]\!]^\sharp$ for any $C \in S$.

We are interested in the least solution of a system $S$ of nonuniform set
constraints. We can obtain an approximation of the least solution of $S$ by
computing the limit of the abstract iteration sequence with widening
$(M^\sharp_n, V^\sharp_n)_{n \geq 0}$ defined as follows:

\[
\begin{cases}
(M^\sharp_0, V^\sharp_0) = (\bot_{\mathcal{M}^\sharp}, \bot_{Val^\sharp}) \\
(M^\sharp_{n+1}, V^\sharp_{n+1}) = (M^\sharp_n, V^\sharp_n)\ \nabla\ ([\![ C ]\!]^\sharp)_{C \in S}\,(M^\sharp_n, V^\sharp_n)
\end{cases}
\]

where $([\![ C ]\!]^\sharp)_{C \in S}$ denotes the application of all constraints
of $S$ in an arbitrary order, and $\nabla$ is the product of the widening
operators on $\mathcal{M}^\sharp$ and $Val^\sharp$. This provides us with an
effective algorithm for computing an approximate solution of the system, which
is similar to that defined by Andersen [1]. The main difference is the use of a
widening operator to enforce convergence, because some abstract numerical
lattices have infinite strictly increasing chains of elements [6, 11, 20]. Once
a post-fixpoint has been reached using this algorithm, we can further refine the
result by using a decreasing iteration sequence with narrowing defined in the
same way. We observed from our experiments that an iteration sequence with
narrowing is always required in order to obtain precise ranges for the timestamp
and offset variables.
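The interplay of widening and narrowing can be illustrated on the simplest numerical lattice, the lattice of intervals. The sketch below is not the solver described in this paper: it applies the standard interval widening and narrowing operators to a single hypothetical equation, the loop-head invariant of `n = 0; while (n < 10) n = n + 1`, and all function names are illustrative.

```python
import math

INF = math.inf


def join(a, b):
    """Least upper bound of two intervals (None encodes bottom)."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def widen(a, b):
    """Standard interval widening: unstable bounds jump to infinity."""
    if a is None:
        return b
    if b is None:
        return a
    lo = a[0] if a[0] <= b[0] else -INF
    hi = a[1] if a[1] >= b[1] else INF
    return (lo, hi)


def narrow(a, b):
    """Standard interval narrowing: only infinite bounds are refined."""
    if a is None or b is None:
        return None
    lo = b[0] if a[0] == -INF else a[0]
    hi = b[1] if a[1] == INF else a[1]
    return (lo, hi)


def step(x):
    """F(x) = [0, 0] join ((x meet (-inf, 9]) + 1)."""
    entry = (0, 0)
    if x is None:
        return entry
    lo, hi = x[0], min(x[1], 9)
    if lo > hi:
        return entry
    return join(entry, (lo + 1, hi + 1))


def solve():
    x = None
    while True:                      # increasing sequence with widening
        nxt = widen(x, step(x))
        if nxt == x:
            break
        x = nxt
    post = x
    while True:                      # decreasing sequence with narrowing
        nxt = narrow(x, step(x))
        if nxt == x:
            break
        x = nxt
    return post, x


post_fixpoint, refined = solve()
```

In this toy case the widening phase stabilizes on an interval with an infinite upper bound, and the narrowing phase recovers the finite range [0, 10]. The same mechanism explains why narrowing matters for the timestamp and offset ranges.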

We now have to show how to extract nonuniform inclusion constraints from
the abstract interpretation of the surface semantics. Let $S^\sharp$ be the
abstract surface semantics of a program P obtained from the analysis described
in the previous section. We assign a unique pair of set variables
$(\mathcal{L}_\ell, \mathcal{R}_\ell)$ to each statement $\ell : *q = r$ or
$\ell : q = *r$ of P, denoting respectively the points-to sets of the left-hand
and right-hand sides of the assignment. Let $\rho^\sharp = (\nu^\sharp, \pi^\sharp)$
be an abstract environment, p a pointer variable of P and $\mathcal{X}$ a set
variable. We denote by $C_{\mathcal{X},p}(\rho^\sharp)$ the collection of
nonuniform constraints defined as follows:

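- If $\&x \in \pi^\sharp(p_a)$, then

  $(\mathcal{X}(t) \supseteq \&x + o,\ \lfloor \nu^\sharp \oplus \{t = \Lambda, o = p_o\} \rfloor_{t,o}) \in C_{\mathcal{X},p}(\rho^\sharp)$

- If $\mathrm{blk}_\ell(\mu^\sharp) \in \pi^\sharp(p_a)$, then

  $(\mathcal{X}(t) \supseteq \mathrm{blk}_\ell(t') + o,\ \lfloor \nu^\sharp \sqcap \mu^\sharp \oplus \{\tau = t', t = \Lambda, o = p_o\} \rfloor_{t,t',o}) \in C_{\mathcal{X},p}(\rho^\sharp)$

- If $\mathrm{ref}_\ell(\mu^\sharp) \in \pi^\sharp(p_a)$, then

  $(\mathcal{X}(t) \supseteq \mathcal{L}_\ell(t') + o,\ \lfloor \nu^\sharp \sqcap \mu^\sharp \oplus \{\tau = t', t = \Lambda, o = p_o\} \rfloor_{t,t',o}) \in C_{\mathcal{X},p}(\rho^\sharp)$

Now, if $\ell : *p = q$ is a memory write statement of P and $\rho^\sharp$ is the
abstract environment of $S^\sharp$ at $\ell$, we generate the constraints:

\[
C_{\mathcal{L}_\ell,p}(\rho^\sharp) \cup C_{\mathcal{R}_\ell,q}(\rho^\sharp) \cup \{(*\mathcal{L}_\ell(t) \supseteq \mathcal{R}_\ell(t'),\ \top_{\mathcal{V}^\sharp} \oplus \{t = t'\})\}
\]

Similarly, for a memory read statement $\ell : p = *q$ we generate the
constraints:

\[
C_{\mathcal{L}_\ell,p}(\rho^\sharp) \cup C_{\mathcal{R}_\ell,q}(\rho^\sharp) \cup \{(\mathcal{L}_\ell(t) \supseteq *\mathcal{R}_\ell(t'),\ \top_{\mathcal{V}^\sharp} \oplus \{t = t'\})\}
\]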

We denote by $S_P$ the system of all constraints generated in this way for the
program P. Let $(M^\sharp_P, V^\sharp_P)$ be an approximation of the least
solution of $S_P$ obtained by an abstract iteration sequence as described
previously. The abstract memory graph $M^\sharp_P$ is a sound global
approximation of the memory graph at every point of the program:

Theorem 2. For every state $(\lambda, A, M, \rho, \ell)$ of the collecting
semantics $C$ of P, we have $M \in \gamma(M^\sharp_P)$.

The pointer analysis problem of [24] has thus been reduced to the simpler and
more tractable problem of solving a system of nonuniform inclusion constraints.

We finish this formal description with a brief description of the constraint
generation for function calls. We associate a special set variable $P_i(f)$ to
the i-th formal parameter of each function f of a C program. We denote by
$P_0(f)$ the variable corresponding to the return value of f. Now consider a
function call $\ell : p = f(p_1, \ldots, p_n)$. Assuming that we are provided
with a collection $\mathcal{A}, \mathcal{A}_1, \ldots, \mathcal{A}_n$ of set
variables describing the sets of addresses that may flow through the return
value and the parameters $p, p_1, \ldots, p_n$ of the function call, we generate
the following points-to equations:

\[
\begin{cases}
(P_i(f) \supseteq \mathcal{A}_i,\ \top_{\mathcal{V}^\sharp}) & 1 \leq i \leq n \\
(\mathcal{A} \supseteq P_0(f),\ \top_{\mathcal{V}^\sharp})
\end{cases}
\]

In other words, function calls are treated uniformly: there are no numerical con- 
straints on the index variables. This is not a problem in practice, since nonuni- 
form behaviours usually take place at the function level in embedded applica- 
tions. We do not detail the analysis of computed calls, which can be easily derived 
from the semantics of the memory read operation p = *q. 

We now illustrate the generation of equations. Consider the small program
of Example 3 that fills in an array of pointers. The equations generated after
the surface analysis are the following:

\[
\begin{cases}
(*\mathcal{L}_6(t) \supseteq \mathcal{R}_6(t'),\ \{t = t',\ 0 \leq t < 10\}) \\
(\mathcal{L}_6(t) \supseteq \&a + o,\ \{o = 4 \times t,\ 0 \leq t < 10\}) \\
(\mathcal{R}_6(t) \supseteq \mathrm{blk}_5(t') + o,\ \{t = t',\ o = 0,\ 0 \leq t < 10\})
\end{cases}
\]

After solving these constraints by using an abstract iteration sequence with
widening, we obtain the following abstract memory graph:

\[
\{((\&a, o) \triangleright (\mathrm{blk}_5(t), o'),\ \{o = 4 \times t,\ o' = 0,\ 0 \leq t < 10\})\}
\]

which describes the exact shape of the memory throughout the execution of the
program.
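To make the meaning of the resulting abstract memory graph concrete, the sketch below enumerates its denotation by brute force over the ten admissible timestamps. This is an illustration under assumptions, not the constraint solver: the names `L6` and `R6` stand for the two set variables attached to the write at label 6, locations are encoded as tuples, and the function name is hypothetical.

```python
def solve_example3_constraints():
    """Enumerate the points-to edges denoted by the constraints of Example 3."""
    timestamps = range(10)                                # 0 <= t < 10
    # L6(t) contains &a + o with o = 4 * t
    L6 = {t: ("&a", 4 * t) for t in timestamps}
    # R6(t) contains blk_5(t') + o with t' = t and o = 0
    R6 = {t: (("blk", 5, t), 0) for t in timestamps}
    # *L6(t) contains R6(t') with t = t': one edge per timestamp
    return {L6[t]: R6[t] for t in timestamps}


edges = solve_example3_constraints()
```

Each of the ten slots of a, at byte offsets 0, 4, ..., 36, points to its own block: the block stamped t sits at offset 4 * t. A uniform analysis would instead merge all slots and all blocks into a single edge.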

5 Experimental Evaluation 

We have implemented the static analysis described in this paper for the full C
language. The analyzer itself consists of 9,000 lines of SML/NJ, excluding the
front-end. We have interfaced the analyzer with the ckit [17] C front-end, which
is also written in SML. We currently use the reduced product of the lattice
of linear equalities [19] and the lattice of intervals [6] for expressing
numerical constraints. The analyzer first translates the C program into an
intermediate language in which all expressions and statements have been broken
down using a 3-address format. We then perform a dependency analysis which is
used to eliminate all arithmetic operations that are not involved in pointer
manipulations. This substantially reduces the size of the code to analyze.
Whole-structure assignment has not been described in this paper and deserves
some attention. There are two possible ways of handling this construct: either
by expanding the assignment into a collection of individual assignments to the
fields of the structure, or by analyzing the assignment as an atomic operation.
The former is made difficult by union types and structure-breaking type casts.
We chose the latter approach, which requires a straightforward extension of
nonuniform constraints in order to copy a packet of pointers at once.

We have applied the analyzer to a real piece of software: an on-board link
controller. The application contains about 25,000 lines of unprocessed C code.
It is a pointer-intensive program with plenty of loop constructs operating on
multidimensional arrays of structures. It is quite representative of an
average-size embedded program, which is the main target of our analysis. Very
large programs like those described in [25] are quite unusual. Our analysis is
quite efficient. It takes 210 seconds to parse the files, construct the abstract
surface semantics and generate the nonuniform inclusion constraints on a laptop
with a 900 MHz Intel Pentium and 1 GB of RAM running Linux under VMware. The
resolution of these constraints only takes 21 seconds.

The results show that the analysis does discover nonuniform points-to
relations. In particular, two-dimensional arrays of distinct semaphores, arrays
of functions and tables of preallocated memory blocks for dedicated memory
management are exactly described. Surprisingly enough, the analysis uncovered a
real bug in this application. While we were reviewing the results of the
analysis we noticed that for some array array2 of dynamically allocated
semaphores, there was no linear relationship between the offset and the
timestamps in the points-to relations. The nonuniform points-to equations
immediately gave us the location in the program where the array was initialized.
The initialization code looks like:

    for (i = 0; i < 20; i++)
        for (j = 0; j < 8; j++) {
            array1[i][j] = semCreate();
            array2[j] = semCreate();
        }

The first array is properly initialized whereas the second one is reinitialized
multiple times, causing a memory leak. Note that the analysis successfully
inferred a nonuniform points-to relation for the two-dimensional array of
semaphores. This bug was present from the very first version of the program
and had never been detected during the 18 months the software has been
undergoing testing so far. This is an interesting application of this static
analysis as a sophisticated typechecker for collections of pointers.
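The symptom described above can be reproduced with a small simulation of the initialization loops. The sketch below is an illustration under assumptions: the element size `SEM_SIZE` is hypothetical, the global loop counter follows the increment and decrement rules of the operational semantics, and the helper names are invented for this example. It records, for each array, the timestamps at which every byte offset is written.

```python
SEM_SIZE = 4  # hypothetical size in bytes of one array element


def record_writes(rows=20, cols=8):
    """Map each written offset to the list of timestamps of its writes."""
    lam = 0
    array1, array2 = {}, {}
    for i in range(rows):
        for j in range(cols):
            array1.setdefault(SEM_SIZE * (cols * i + j), []).append(lam)
            array2.setdefault(SEM_SIZE * j, []).append(lam)
            lam += 1      # end of an inner iteration
        lam -= 1          # stepping out of the nested loop
        lam += 1          # end of an outer iteration
    return array1, array2


def overwritten(writes):
    """Offsets written more than once: earlier allocations are lost."""
    return {off: ts for off, ts in writes.items() if len(ts) > 1}


array1, array2 = record_writes()
```

For array1 every offset is written once and equals `SEM_SIZE` times the timestamp, a linear relation. For array2 each of the 8 offsets is written at 20 different timestamps, so no such relation exists and 19 of the 20 semaphores created for each slot are leaked.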

6 Conclusion 

We have presented a pointer analysis that is able to infer nonuniform points-to
relationships without the cost of existing flow-sensitive analyses [14, 24]. The
originality of our work is that it reconciles two approaches to pointer
analysis, abstract interpretation and constraint-based analysis, which are often
opposed to each other. Although we could have expressed the whole analysis
within the framework of Abstract Interpretation [10], we think that a
constraint-based presentation is more compact and more intuitive for both
understanding and implementing the analysis. We have shown on a representative
case study that our approach is tractable and achieves the expected level of
precision. Unexpectedly, this analysis has been able to detect a subtle
initialization bug in a real application. It now remains to perform more
extensive empirical studies and investigate the use of the analysis in a real
verification tool.
Abstract. This paper addresses scalability and accuracy of summary-based
context-sensitive pointer analysis formulated as a two-phase computation. The
first phase, or bottom-up phase, propagates procedure summaries from callees to
callers. Then, the second phase, or top-down phase, computes the actual pointer
information. These two phases can be independently context-sensitive. Having
observed the problems that procedural side effects cause, we developed a
bottom-up phase that constructs concise procedure summaries in a manner that
permits their subsequent removal. This transformation results in an efficient
two-phase pointer analysis in the style of Andersen [1] that is simultaneously
bottom-up and top-down context-sensitive. Context sensitivity becomes inherent
to even a context-insensitive analysis, allowing for an accurate and efficient
top-down phase. The implemented context-sensitive analysis exhibits scalability
comparable to that of its context-insensitive counterpart. For instance, to
analyze 176.gcc, the largest C benchmark in SPEC 2000, our analysis takes 190
seconds as opposed to 44 seconds for the context-insensitive analysis. Given the
common practice of treating recursive subgraphs context-insensitively, its
accuracy is equivalent to an analysis which completely inlines all procedure
calls.



1 Introduction 

Modern programming practices encourage code reuse, which often results in 
programs composed of a complex network of procedure calls. For static analysis, 
this exacerbates the problem of unrealizable (or spurious) interprocedural data 
flow [24]. Context-sensitive analyses are often able to avoid unrealizable data 
flow through procedure calls thereby delivering a higher degree of accuracy than 
their context-insensitive counterparts. 

In the literature, there have been many approaches to context-sensitive 
pointer analysis. Some analyses [8,16,19,25,27] mimic dynamic execution, re- 
peating call-return sequences until the analysis reaches a global fixed point. In 
these analyses, a procedure must be re-analyzed whenever there is a change in 
its callers or callees, leading to serious scalability issues. More recent work [2,3, 
12,20,21] formulates pointer analysis as a two-phase computation. This formu- 
lation has the advantage that it can be designed to analyze a procedure at most 
twice when given a fixed acyclic call graph. 
In the first phase of the two-phase computation, called the bottom-up phase,
a flow function is constructed for each procedure by propagating procedure
summaries from callees to callers. Then, the second phase, called the top-down
phase, computes the actual pointer information by propagating calling contexts
from callers to callees.

Each phase may be independently context-sensitive or -insensitive. On one 
hand, from the perspective of a caller, distinct calls to the same procedure should 
be analyzed independently. We call this bottom-up context sensitivity. On the 
other hand, from the perspective of a callee, its calling contexts from distinct call 
sites should be treated independently. We call this top-down context sensitivity. 

Previous work [2, 3, 12, 20, 21] primarily focused on the scalability of bottom- 
up context sensitivity and provided little discussion about scalable top-down 
context sensitivity. When designed to analyze each procedure at most twice, the 
top-down phase in previous analyses has the following limitations: 

1. The merging of calling contexts from distinct call sites before analyzing a callee 
leads to spurious interprocedural data flow. 

2. The propagation of calling context information from the outputs of callers’ flow 
functions to the inputs of callees’ flow functions involves significant copying over- 
heads, impacting scalability. 

While investigating these limitations, we realized the importance of
procedural side effects (loosely, a callee’s effects on a caller) in dealing
with context sensitivity. If a program lacks procedural side effects, a
context-insensitive analysis does not derive any unrealizable interprocedural
data flow. Based on this finding, we have reformulated summary-based analysis as
follows:

1. The bottom-up phase transforms a program into a form that lacks procedural side
effects, yet has the same overall pointer behavior. This involves the cutting and
pasting of procedural side effects from callees to callers instead of the copying and
pasting performed by previous work [2, 3, 12, 20, 21].

2. The top-down phase consists of a single run of the context-insensitive analysis. 
Since the program is free of side effects after the bottom-up phase, the context- 
insensitive analysis does not experience any unrealizable interprocedural data flow. 
Therefore, in this form, top-down context sensitivity is inherent without any need 
to copy calling contexts during the top-down propagation. 

To demonstrate the effectiveness of the proposed scheme, we implemented a 
pointer analysis for C programs with the following characteristics. 

1. Intraprocedurally, it is a variant of Andersen’s algorithm. In particular, we use
the formulation by Heintze and Tardieu [14]. Fields are handled in an offset-based 
manner while array indices are ignored. For each heap allocation site, we introduce 
a unique global variable. 

2. Procedures in a common call graph cycle are merged into a single procedure. 
Effectively, recursion is handled in a context-insensitive way. Indirect calls are 
handled optimistically and iteratively (Section 3). 

3. Procedural side effects are completely specialized. Thus, given an acyclic call graph, 
the implemented analysis is as accurate as one that transitively inlines all procedure 
calls. 
(a)
    int g;

    iris() {
        int a;
        jade_1(&g, 1);
        jade_2(&a, 3);
    }

    jade(int *p, int q) {
        int r;
        *p := q;
        r := g + 5;
    }

(b)
    p := &g;  q := 1;
    p := &a;  q := 3;
    *p := q;
    r := g + 5;

    p := &g    *p := q
    ------------------
          g := q          q := 3
          ----------------------
    r := g + 5         > g := 3
    ---------------------------
             > r := 8

(c)
    int g;

    iris() {
        int a;
        { p_1 := &g;  q_1 := 1;  *p_1 := q_1; }    pasted for jade_1   =>  g := 1;
        { p_2 := &a;  q_2 := 3;  *p_2 := q_2; }    pasted for jade_2   =>  a := 3;
        jade_1(&g, 1);
        jade_2(&a, 3);
    }

    jade(int *p, int q) {
        int r;
        *p := q;
        r := g + 5;
    }



Fig. 1. Example 1: (a) code fragment, (b) flow- and context-insensitive equivalent 
followed by a possible derivation where the assignments marked with > are spurious 
due to context loss, (c) the copying and pasting of jade’s summary *p:=q prevents the 
spurious derivation of g:=3 and a:=1.



The remainder of this paper is organized as follows. Section 2 demonstrates
the limitations of previous approaches using a simple example. In Section 3, we
give an overview of how the proposed analysis overcomes those limitations,
followed by more detailed algorithmic descriptions in Section 4. Section 5
presents empirical results. In this paper, all the algorithms are presented as
bottom-up logic programs [22]. In each inference rule, the premises above the
line are the conditions under which the conclusion below the line is derived.
The actual algorithm only tracks pointer values and thus ignores integer
constants. However, in examples, we use integer constants as pointer surrogates
for clarity.



2 Limitations with Previous Work 

Through a simple example, this section illustrates the limitations of previous 
summary-based analyses. At the end of this section, the same example is used 
to demonstrate how the proposed analysis overcomes such limitations. 

Example 1. The code fragment in Figure 1(a) shows typical pointer use for a C program. 
Within this example, the discussion focuses on the following data flow facts: 

1. In iris, after the first call to jade (denoted jade1), global variable g acquires 1. 
After the second call to jade (denoted jade2), variable a acquires 3. 

2. In both calls to procedure jade, since global variable g always evaluates to 1, local 
variable r always acquires 6. 

The top of Figure 1(b) shows a flow- and context-insensitive equivalent of 
Example 1, where the assignments from iris and jade are collected without 
procedural boundaries and ignoring the ordering among them. The following 
derivation in Figure 1(b) depicts an application of the inference rules to these 
assignments. When Example 1 is analyzed context-insensitively, the interaction 
of p:=&g from call site jade1 and q:=3 from call site jade2 produces two spurious 
results g:=3 and r:=8. Therefore, to capture the data flow facts in Example 1 
accurately, context sensitivity is critical. 
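
The spurious derivation can be reproduced with a few lines of code. The sketch below is not the implementation evaluated in this paper; it is a minimal flow- and context-insensitive fixpoint over the assignments collected at the top of Figure 1(b), using integer constants as pointer surrogates as the examples do. The tuple encoding and the name `solve` are assumptions made purely for illustration.

```python
from collections import defaultdict

def solve(stmts):
    """Flow- and context-insensitive fixpoint over a bag of assignments.

    Hypothetical encoding: (op, lhs, rhs) with op one of
      'const' (lhs := k), 'addr' (lhs := &rhs), 'store' (*lhs := rhs),
      'add'   (lhs := var + k, with rhs given as the pair (var, k)).
    """
    val = defaultdict(set)              # variable -> values it may hold
    changed = True
    while changed:                      # statement order is ignored
        changed = False
        for op, lhs, rhs in stmts:
            if op == "const":
                targets, new = [lhs], {rhs}
            elif op == "addr":
                targets, new = [lhs], {("&", rhs)}
            elif op == "store":
                # write through every location lhs may point to
                targets = [v[1] for v in val[lhs] if isinstance(v, tuple)]
                new = set(val[rhs])
            else:                       # "add"
                var, k = rhs
                targets = [lhs]
                new = {v + k for v in val[var] if isinstance(v, int)}
            for t in targets:
                if not new <= val[t]:
                    val[t] |= new
                    changed = True
    return val

# Assignments of iris and jade, collected without procedure boundaries.
example1 = [
    ("addr", "p", "g"), ("const", "q", 1),    # parameter passing for jade1
    ("addr", "p", "a"), ("const", "q", 3),    # parameter passing for jade2
    ("store", "p", "q"),                      # *p := q
    ("add", "r", ("g", 5)),                   # r := g+5
]
result = solve(example1)
print(sorted(result["g"]), sorted(result["r"]))   # [1, 3] [6, 8]
```

Because p may point to both g and a while q may hold both 1 and 3, the single store *p:=q mixes the two call sites, which is exactly the context loss marked with > in Figure 1(b).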

2.1 Bottom-Up Phase 

In a typical summary-based analysis, the goal of the bottom-up phase is to 
construct a concise flow function for each procedure. Given a calling context 
as an input, this function should return the final pointer information of the 
procedure, incorporating all the effects of its callees. 

All the flow functions are computed in a single bottom-up sweep of the call 
graph, propagating procedure summaries from callees to callers. The procedure 
summary is similar to the flow function except that, instead of the final pointer 
information for the whole procedure, it returns only the change in pointer infor- 
mation visible to the callers, called procedural side effects. 

In Example 1, the only assignment in jade that causes side effects is *p:=q. 
Thus, the summary of jade consists of the single assignment *p:=q. In the 
bottom-up phase, as shown in Figure 1(c), a specialized version of *p:=q is 
pasted into each call site of jade (for instance, *p1:=q1 for jade1) along with 
assignments that mimic parameter passing. 
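
Pasting a summary into a call site amounts to renaming the callee's formals with a per-site suffix and prepending the parameter-passing assignments. The following sketch illustrates this for jade. The function name `paste` and the tuple encoding are hypothetical and only mirror the notation of Figure 1(c).

```python
def paste(summary, formals, actuals, site):
    """Specialize a callee summary for one call site.

    summary: list of (op, lhs, rhs) over the callee's formals
    formals: formal parameter names, e.g. ["p", "q"]
    actuals: one (op, value) pair per formal, e.g. ("addr", "g") or ("const", 1)
    site:    call-site index used as the renaming suffix
    """
    rename = {f: f"{f}{site}" for f in formals}
    # assignments that mimic parameter passing: p1 := &g; q1 := 1;
    pasted = [(op, rename[f], value) for f, (op, value) in zip(formals, actuals)]
    # specialized copy of the summary: *p1 := q1;
    for op, lhs, rhs in summary:
        pasted.append((op, rename.get(lhs, lhs), rename.get(rhs, rhs)))
    return pasted

jade_summary = [("store", "p", "q")]          # *p := q
site1 = paste(jade_summary, ["p", "q"], [("addr", "g"), ("const", 1)], 1)
site2 = paste(jade_summary, ["p", "q"], [("addr", "a"), ("const", 3)], 2)
```

Since every pasted copy uses its own names (p1, q1 versus p2, q2), the two call sites can no longer exchange values inside iris.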

2.2 Top-Down Phase 

The goal of the top-down phase is to compute the actual pointer information. 
Since the flow functions of each procedure are available, this information can be 
computed in a single top-down sweep of the call graph. The top-down phase begins 
by analyzing procedure iris. As shown in Figure 1(c), the inclusion of jade’s 
summary makes the effects of jade1 and jade2 exist locally. At iris, only the 
intended results g:=1 and a:=3 are derived, leaving the pointer information for 
iris accurate and complete. 

To analyze the next procedure, jade, the calling contexts must be formed 
then propagated (copied) from iris to jade. The calling contexts for the two 
calls to jade are: 

    jade1:  g:=1; a:=3; p:=&g; q:=1; 
    jade2:  g:=1; a:=3; p:=&a; q:=3; 

To analyze procedure jade only once during the top-down phase, these calling 
contexts must be merged before the application of jade’s flow function, as shown 
in Figure 2(a). While bottom-up context sensitivity allowed the accurate derivation 
of g’s and a’s values in iris, the lack of top-down context sensitivity leads, 
again, to the spurious derivations g:=3, a:=1, and r:=8. Without intervention, 
at the completion of jade all benefits of the context-sensitive bottom-up phase 
have been lost. 

(a)
    int g;

    jade(int *p, int q) {
      calling contexts from iris:  p:=&g; q:=1;
                                   p:=&a; q:=3;
      int r;
      *p:=q;       =>  g:=3; a:=1;
      r:=g+5;      =>  r:=6; r:=8;
    }

(b)
    iris() {
      int a;
      pasted for jade1:  p1:=&g; q1:=1; *p1:=q1;
      pasted for jade2:  p2:=&a; q2:=3; *p2:=q2;
      jade1(&g,1);
      jade2(&a,3);
    }

    jade(int *p, int q) {
      int r;
      *p:=q;       (cut from jade)
      r:=g+5;
    }

Fig. 2. Example 1 continued: (a) in the top-down phase for jade the spurious results 
g:=3 and r:=8 are still derived, (b) after cutting out the unnecessary assignment *p:=q 
from jade, the top-down phase no longer derives spurious results. 



In practice, despite its negative impact on scalability, explicit copying of 
calling contexts during the propagation of pointer information from callers to 
callees contains some of the contamination. When the pointer information of a 
is needed, pointer analysis clients will look up the output of iris’ flow function, 
which does not contain the spurious result a:=1. However, without true top-down 
context sensitivity, pointer analysis clients must query jade for r and 
will obtain the spurious result r:=8. 

2.3 Observation 

The derivation of r:=8 in Figure 2(a) can be avoided by removing *p:=q from 
jade following the construction of jade’s summary. After the bottom-up phase, 
the original assignment *p:=q in jade is irrelevant from the perspective of iris, 
since the pasting of jade’s summary provides local, specialized copies. Since g’s 
result is naturally available in jade through the calling context, the original 
assignment *p:=q in jade is unnecessary, and, more importantly, problematic. 

This observation is reflected in Figure 2(b), where *p:=q has been removed 
from jade. After its removal, only the intended results are derived for both iris 
and jade. Moreover, since iris is truly unaffected by jade after this change, 
iris’s pointer information does not need to be set aside through explicit copying. 
The rest of this paper presents a generalization of the process of converting a 
program with procedural side effects to an equivalent one without any. 
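
The effect of cutting *p:=q can be checked with a small experiment. The sketch below is an illustration under simplifying assumptions, not the paper's implementation: it runs one context-insensitive fixpoint over the pasted program of Figure 2, once with the original store left in jade and once with it cut. The tuple encoding and the name `solve` are hypothetical.

```python
from collections import defaultdict

def solve(stmts):
    """Context-insensitive fixpoint; (op, lhs, rhs) encoding is hypothetical."""
    val = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for op, lhs, rhs in stmts:
            if op == "const":                   # lhs := k
                targets, new = [lhs], {rhs}
            elif op == "addr":                  # lhs := &rhs
                targets, new = [lhs], {("&", rhs)}
            elif op == "store":                 # *lhs := rhs
                targets = [v[1] for v in val[lhs] if isinstance(v, tuple)]
                new = set(val[rhs])
            else:                               # "add": lhs := var + k
                var, k = rhs
                targets = [lhs]
                new = {v + k for v in val[var] if isinstance(v, int)}
            for t in targets:
                if not new <= val[t]:
                    val[t] |= new
                    changed = True
    return val

# Specialized copies of jade's summary, pasted into iris (Figure 2(b)).
pasted_in_iris = [
    ("addr", "p1", "g"), ("const", "q1", 1), ("store", "p1", "q1"),
    ("addr", "p2", "a"), ("const", "q2", 3), ("store", "p2", "q2"),
]
# Merged calling contexts flowing from iris into jade.
contexts_for_jade = [
    ("addr", "p", "g"), ("const", "q", 1),
    ("addr", "p", "a"), ("const", "q", 3),
]
jade_original = [("store", "p", "q"), ("add", "r", ("g", 5))]
jade_after_cut = [("add", "r", ("g", 5))]       # *p := q removed

kept = solve(pasted_in_iris + contexts_for_jade + jade_original)
cut = solve(pasted_in_iris + contexts_for_jade + jade_after_cut)
print(sorted(kept["r"]), sorted(cut["r"]))      # [6, 8] [6]
```

With the store kept, the merged contexts of jade reintroduce g:=3, a:=1, and r:=8. With the store cut, g, a, and r hold only 1, 3, and 6, even though the analysis itself remains context-insensitive.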

3 Overview 

In C programs, indirect calls can be made using function pointers. In the presence 
of indirect calls, a cyclic dependency exists between call graph construction and 
pointer analysis. Without having a complete call graph, pointer information may 
be incomplete. On the other hand, without complete pointer information, the 
call graph may be incomplete. 

The proposed analysis breaks this cyclic dependency using an iterative approach 
as in [3, 16]. It begins with an incomplete call graph consisting of only 
direct calls. Based on this incomplete call graph, pointer information is 
constructed. Then, using this as feedback, the call graph is updated. This process 
continues until there are no more changes to the call graph. Since the bottom-up 
phase does not terminate in the presence of recursive procedures, they are merged 
into a single procedure, while the recursive calls between them are converted into 
a set of parameter-passing assignments. Effectively, recursion is handled 
context-insensitively. 
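
The iteration between call graph construction and pointer analysis can be sketched as follows. This is a deliberately simplified stand-in, not the analysis of this paper: the "pointer analysis" step merely collects, for each function pointer variable, the targets assigned to it in procedures reachable through the current call graph. All names (`build_call_graph`, the procedure names, the pointer variables h and k) are hypothetical.

```python
from collections import defaultdict

def reachable(edges, root):
    """Procedures reachable from root in the current call graph."""
    seen, work = {root}, [root]
    while work:
        for callee in edges.get(work.pop(), ()):
            if callee not in seen:
                seen.add(callee)
                work.append(callee)
    return seen

def build_call_graph(direct, indirect, fp_assign, root):
    """direct:    {caller: set of callees reached by direct calls}
    indirect:  {caller: function pointer variables it calls through}
    fp_assign: {proc: [(fp_var, target_proc)]}, i.e. fp_var := &target_proc
    """
    edges = defaultdict(set)
    for caller, callees in direct.items():      # start with direct calls only
        edges[caller] |= callees
    while True:
        live = reachable(edges, root)
        # stand-in for pointer analysis over the procedures known so far
        pts = defaultdict(set)
        for proc in live:
            for fp, target in fp_assign.get(proc, ()):
                pts[fp].add(target)
        # feedback: resolve indirect calls with the new pointer information
        changed = False
        for caller in live:
            for fp in indirect.get(caller, ()):
                new = pts[fp] - edges[caller]
                if new:
                    edges[caller] |= new
                    changed = True
        if not changed:                         # call graph is stable
            return {c: set(cs) for c, cs in edges.items() if cs}

direct = {"main": {"init"}}
indirect = {"main": ["h"], "worker": ["k"]}
fp_assign = {"init": [("h", "worker")],
             "worker": [("k", "logger")],
             "unused": [("h", "dead")]}
graph = build_call_graph(direct, indirect, fp_assign, "main")
```

In this example the edge from worker to logger is found only in the second round, after the first round has made worker reachable; the assignment in the unreachable procedure never contributes an edge.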

3.1 Impact of Side Effects 

The rationale behind the proposed pointer analysis is best demonstrated by 
examining the impact of procedural side effects on the accuracy of 
context-insensitive analysis. Let us reconsider the example in Figure 1. By analyzing 
this example context-insensitively, we can derive the following results. 

    [p:=&g]1   *p:=q          [p:=&a]2   *p:=q          [p:=&g]1   *p:=q
    ----------------          ----------------          ----------------
       g:=q    [q:=1]1           a:=q    [q:=3]2           g:=q    [q:=3]2
       ---------------           ---------------           ---------------
            g:=1                      a:=3                    > g:=3

The first two derivations are from call sites 1 and 2 respectively and are both 
realizable in some dynamic execution of the code fragment. However, intermixing 
the parameter assignments from these two call sites results in the third, spurious, 
derivation reproduced from Figure 1(b). The key problem in this derivation stems 
from the fact that g:=q is valid only within the call site jade1 while q:=3 is valid 
only within jade2. Since they are from two different call sites, the two should not be 
allowed to interact. 

As proved in the technical report [18], no such spurious interaction can occur 
if a program completely lacks procedural side effects. Even though a procedure 
may have many input calling contexts, without procedural side effects there is 
no way in which the contexts can interact. For side-effect free programs, full 
context sensitivity is inherent even when using a context-insensitive analysis. 

3.2 Specialization of Side Effects 

In reality, most programs have procedural side effects. Therefore, to exploit this 
finding, the proposed analysis hoists specialized copies of the procedural side 
effects from callees into callers and, then, cuts the side effects from the callees. 
The specialization continues until the program is free of procedural side effects, 
thereby allowing the use of a context-insensitive style analysis in the following 
phase while avoiding all spurious interprocedural interaction. 

(a)
    int a,b,c;

    kate() {
      lily1(&a, 1);
      lily2(&b, 3);
      c := a+b;
    }

    lily(int *p, int q) {
      int r,s;
      mary3(&r, q);
      mary4(&s, 5);
      *p := r;
    }

    mary(int *x, int y) {
      int z,w;
      z := y;  *x := z;
      w := *x+7;
    }

(b)
    int a,b,c;

    kate() {
      { p1 := &a;  q1 := 1;  *p1 := q1; }
      { p2 := &b;  q2 := 3;  *p2 := q2; }
      lily1(&a, 1);
      lily2(&b, 3);
      c := a+b;
    }

    lily(int *p, int q) {
      int r,s;
      { x3 := &r;  y3 := q;  *x3 := y3; }
      { x4 := &s;  y4 := 5;  *x4 := y4; }
      mary3(&r, q);
      mary4(&s, 5);
      *p := r;              (cut)
    }

    mary(int *x, int y) {
      int z,w;
      z := y;  *x := z;     (*x := z cut)
      w := *x+7;
    }

Fig. 3. Example 2: (a) code fragment, (b) demonstration of the specialization of 
procedural side effects. 



Example 2. The code fragment in Figure 3(a) will be used to illustrate the key aspects 
of the specialization process in the proposed analysis. 

1. In procedure mary, the assignment *x:=z is the direct cause of side effects. By 
specializing *x:=z into mary’s call sites (pasting its copies into call sites while 
cutting the original one), mary is left free of side effects, as shown in Figure 3(b). 

2. Procedure lily also has side effects (due to *p:=r). Note that, when procedure 
lily is visited, the copies of mary’s summary are already present in lily. For this 
example, compaction (or simplification) is important to reduce the size of lily’s 
summary. 

The bottom-up phase of our analysis traverses the acyclic-rendered call graph 
in a reverse-topological order, specializing side effects along the way. First, all the 
potential side-effect derivations are identified based on the concept of criticality. 
Informally speaking, we say that an assignment is critical if it has the potential 
to cause a side effect. In Example 2, the assignments *x:=z in mary and *p:=r 
in lily fall into this category. 

The role of a procedure summary is to promote all the necessary side effects to 
the caller’s level where the contexts will no longer intermix. Critical assignments 
are used as a seed in the creation of the summary. The pasting of summaries 
permits cutting out critical assignments from procedures, leaving them free of 
side effects while maintaining the effect of the original assignments. 

Summary size is the single most important factor governing the scalability 
of this pointer analysis algorithm. In an extreme case, an entire callee can be 
used as a summary. However, considering their transitive impact on scalability, 
it is necessary for summary sizes to be kept as small as possible. For instance, in 
mary, two assignments *x:=z and z:=y can be compacted into the single assignment 
*x:=y, reducing the size of the summary. In addition, since the assignment 
w:=*x+7 causes only local effects, by ignoring it, one can further reduce the 
summary size. Back-tracing is used to construct compacted assignments to be 
added into the summary. The following describes the actual back-tracing process 
in mary and the key intermediate decisions made during the process. 

1. Back-tracing is initiated from the critical assignment *x:=z. However, since its 
right-hand side z is not a parameter, the addition of *x:=z into the summary is 
deferred and each assignment that modifies z is examined. 

2. The assignment z:=y is the only assignment modifying variable z. Thus, by 
back-substituting z in *x:=z with y, the data flow is compacted into *x:=y. Since both 
x and y are parameters, no further compaction is necessary. Therefore, *x:=y is 
added to mary’s summary. 

After the summary *x:=y is formed, it is pasted into each call site in lily 
while the critical assignment *x:=z is cut from mary. 
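
The two back-tracing steps for mary can be mimicked by a small worklist routine. The sketch below covers only the case described above, in which the left-hand side of the critical store is already a parameter and its right-hand side is reached through plain copies; loads, address-of assignments, and aliasing are left out. The function name `backtrace_store` and the pair encoding are assumptions for illustration.

```python
def backtrace_store(store, copies, params):
    """Compact a critical store *u := v by back-substituting its right-hand side.

    store:  (u, v) for the critical assignment *u := v
    copies: list of (dst, src) for plain assignments dst := src
    params: names of the procedure's parameters
    Simplification: u is assumed to be a parameter, and only plain copies
    into v are followed (no loads, address-of, or aliases).
    """
    summary, seen, work = set(), set(), [store]
    while work:
        u, v = work.pop()
        if (u, v) in seen:
            continue
        seen.add((u, v))
        if v in params:
            summary.add((u, v))         # reached an input: emit *u := v
        else:
            # insertion is deferred: examine every assignment that modifies v
            work.extend((u, src) for dst, src in copies if dst == v)
    return summary

# mary: z := y; *x := z;  with parameters x and y
mary_summary = backtrace_store(("x", "z"), [("z", "y")], {"x", "y"})
print(mary_summary)   # {('x', 'y')}, i.e. *x := y
```

If z had several definitions, each one reaching a parameter would contribute its own compacted store to the summary.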

3.3 Coping with Aliases 

For correctness, back-tracing must consider all data flow, making it necessary 
to determine every location in which a variable may be modified. In reality, 
aliasing complicates this determination because complete data flow information 
is no longer explicit within the program text. The final decision on the form of 
the summary must be deferred to a higher level and will be formalized in Section 4. 
Consider procedure lily following the processing of mary. 

1. Since the assignment *p:=r has the potential to modify an external variable, to 
construct lily’s summary, back-tracing is initiated from the assignment. 

2. Note that no explicit data flow exists into r. However, since r’s address was taken 
at x3:=&r, there may be implicit data flow that modifies r. In this case, all the 
data flow of r can be determined through a local examination of lily, resulting 
in the derivation of the implicit data flow r:=q from the specialized copy of mary’s 
summary for the call site mary3. 

3. By continuing back-tracing, *p:=r and the implicit data flow r:=q are compacted 
into *p:=q. Since there is no other modification of r, the assignment *p:=q suffices 
as lily’s summary. 

After lily is processed, all side effects in Example 2 can be removed while 
leaving the overall pointer behavior unchanged. Then, the second phase applies 
a single run of the context-insensitive analysis and computes all pointer 
information completely and accurately. Note that, during this step, all the effects of the 
original critical assignments, namely, *x:=z in mary and *p:=r in lily, exist at 
the caller’s level and are propagated back into the callees via regular 
parameter-passing mechanisms. After specialization, mary’s parameter x still points to r 
and s. Thus, *x correctly evaluates to 1, 3, and 5 in the assignment w:=*x+7. 

An important aspect in lily that simplifies summary generation is that the 
aliases of the address-taken variables r and s could be completely and accurately 
determined without knowledge of lily’s calling context. Unfortunately, this is 
not always possible. 

(a)
    nina() {
      int a,b;
      int *c,*d,*e;
      oliv1(&a, 1, &c, &c);
      oliv2(&b, 3, &d, &e);
    }

    oliv(int *p, int q,
         int **r, int **s) {
      int x,y;
      1: x := q;
      2: *r := &x;  *p := **s;
      3: y := *x+5;
    }

(b)
    nina() {
      int a,b;
      int *c,*d,*e;
      { x1 := q1;  *r1 := &x1;  *p1 := **s1; }
      { x2 := q2;  *r2 := &x2;  *p2 := **s2; }
      oliv1(&a, 1, &c, &c);
      oliv2(&b, 3, &d, &e);
    }

    oliv(int *p, int q,
         int **r, int **s,
         int x) {
      int y;
      x := q;
      *r := &x;  *p := **s;
      y := *x+5;
    }

Fig. 4. Example 3: (a) code fragment before and (b) after side effects are specialized. 



Example 3. The code fragment in Figure 4(a) depicts a situation where accurate alias 
relations may not be determined without the knowledge of the actual calling contexts. 
The following are the key aspects of the example that complicate the specialization 
process. 

1. In oliv, by the assignment *r:=&x, the address of local variable x is assigned to 
variables nonlocal to oliv. Note that this is a safe use of x since it is used only 
when oliv is active. 

2. In the first call site oliv1, since r and s point to the same variable c, x is copied 
into *p at Line 2. Therefore, there is a flow from q through x into *p, allowing 
a:=1. 

3. In the second call site oliv2, since r and s point to distinct variables, x is not 
copied into *p at Line 2. Instead, the content of *e is read. Therefore, there is no 
flow from q to *p, thus b does not acquire 3. 

The last two aspects raise a dilemma. For the first call site oliv1, the flow 
*p:=q must be reflected in the summary. However, for the second call site oliv2, 
its inclusion would degrade the solution’s accuracy. The problem is that the data 
flow in the summary depends on the calling contexts, which are not available 
during the bottom-up phase. 

There are multiple approaches to dealing with this dilemma. First, the assignment 
*p:=q can conservatively be included in the summary. Despite the accuracy 
degradation, this approach is sound. Second, two versions of summaries can be 
explicitly provided: one version for the case where *r and *s are aliases and the 
other for the case where they are not. The summaries in [2] fall into this category. 

Instead, in this paper, we take another approach. As shown in Figure 4(b), 
by introducing x:=q, *r:=&x, and *p:=**s into the summary, the alias of x can 
be resolved at the caller’s level. Note that, to do so, &x and all potential writes 
to x have been included into the summary. 

This decision has an impact on summary size. In an extreme case, where every 
variable behaves like x, the summary size will explode rapidly. However, in C 
programs, the lifetime of local variables is bounded by their declaring procedure 
and this constraint greatly influences typical programming practice. For this 
reason, the address of a local variable is rarely copied into a non-local memory 
space (which might be safe) or returned (which would be a bug). 

The promotion of variable x from the callee oliv to the caller nina has 
another consequence. As shown in Figure 4(b), the assignment y:=*x+5 accesses 
variable x, but only as a consumer. Thus, it causes only local effects and did 
not become a part of oliv’s summary. In order to keep the overall information 
unchanged, especially for variable y, we merge all specialized versions of variable 
x (x1 and x2) by treating them as extra arguments at the corresponding call 
sites as shown in Figure 4(b). Note that, after the merging, the following run 
of the context-insensitive analysis computes the pointer information completely 
and accurately. 

4 Specialization Algorithm 

This section presents the specifics surrounding the identification of critical 
assignments and the assembly of concise summaries. The proposed analysis 
removes procedural side effects of a program by hoisting side effects upward in 
the call graph. Side effects are specialized until they can no longer interfere 
spuriously. When a procedure is about to be processed, all the side effects of its 
callees are already locally available in the procedure while the callees are left 
free of side effects. Therefore, in processing an individual procedure, we need to 
be concerned only with intraprocedural data flow. 

4.1 Detecting Criticality 

As shown in the example in Figure 3, identifying critical assignments is the key to 
identifying procedural side effects. The difficulty in finding critical assignments 
lies in the fact that, when a procedure is being processed during the bottom-up 
traversal, no information about its calling contexts is available. 

This difficulty is resolved by capturing the effects of calling contexts through 
two properties holding, opaque : Var → Bool. Roughly speaking, a variable 
is said to be holding if it may point to an external variable. Provided that 
such information is available, critical assignments can be found by the rules in 
Figure 5. 

(a)
    u := *v   holding(v)            u := v   holding(v)
    --------------------            -------------------
         holding(u)                     holding(u)

    *u := v   v := &w   holding(u)      u := &v   opaque(u)
    ------------------------------      -------------------
              opaque(w)                      opaque(v)

    param(u) ∨ opaque(u)            u := &v   opaque(v)
    --------------------            -------------------
         holding(u)                     holding(u)

(b)
    *u := v   holding(u)            u := v   opaque(u)
    --------------------            ------------------
      critical(*u := v)              critical(u := v)

    u := *v   opaque(u)             u := &v   opaque(u)
    -------------------             -------------------
     critical(u := *v)               critical(u := &v)

Fig. 5. Symbolic derivation of side effects: (a) symbolic simulation of the effects of 
calling contexts, (b) determination of criticality. 

Example 4. In the following code fragment, the parameter p is copied into the local 
variable b by the assignment *a:=p. Therefore, since b may point to an external variable, 
the assignment *b:=q is critical. Using the holding property described in Figure 5(a), 
the proposed analysis determines the criticality of *b:=q as shown in the following 
derivation. 

    pete(int *p, int q) {
      int *a,b;
      a:=&b; *a:=p; *b:=q;
    }

                      param(p)
                     ----------
    a:=&b   *a:=p    holding(p)
    ---------------------------
       *b:=q    holding(b)
       -------------------
         critical(*b:=q)

In certain cases, the address of a local variable can be copied into an external 
variable. In the proposed analysis, this local variable is called opaque in the sense 
that its actual data flow becomes opaque due to the lack of calling contexts. This 
was seen in Example 3 where variable x in oliv is opaque. For a similar reason, 
a variable pointed to by an opaque variable must be regarded as being opaque 
as well. Note that, to be conservative, an opaque variable must be regarded as 
holding, too. 

In back-tracing, an opaque variable is problematic because its alias information 
is unknown, making it unclear where or how it is modified. As explained 
in Section 3.3, we handle this ambiguity by promoting the relevant assignments 
into the callers, thereby hoisting the alias resolution to a point where more 
information about the calling contexts is known. This implies that all potential 
modifications of such variables must be reflected in the summary. From this 
perspective, as depicted by the rules in Figure 5(b), it is convenient to treat an 
opaque variable as external. 
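
The interplay of the holding, opaque, and critical properties can be explored with a small fixpoint computation. The sketch below encodes our reading of the rules in Figure 5 and should be taken as an approximation rather than the exact rule set: the only derived assignment it considers is w:=v, obtained from u:=&w and *u:=v in the same procedure. The function name `classify`, the pair encoding, and the temporaries t, t1, t2, and y0 used to normalize oliv are hypothetical.

```python
def classify(params, copy=(), load=(), store=(), addr=()):
    """copy:  (u, v) for u := v       load: (u, v) for u := *v
    store: (u, v) for *u := v      addr: (u, v) for u := &v
    Returns the sets (holding, opaque, critical)."""
    # derived copies: u := &w and *u := v give w := v
    copies = set(copy) | {(w, v) for (u, v) in store
                          for (u2, w) in addr if u2 == u}
    holding, opaque = set(params), set()        # parameters are holding
    changed = True
    while changed:
        changed = False
        new_opaque = (
            {w for (u, v) in store for (v2, w) in addr
             if v2 == v and u in holding}       # address escapes via a store
            | {v for (u, v) in addr if u in opaque}
        )
        new_holding = (
            opaque | new_opaque                 # opaque implies holding
            | {u for (u, v) in copies if v in holding}
            | {u for (u, v) in load if v in holding}
            | {u for (u, v) in addr if v in opaque}
        )
        if not (new_opaque <= opaque and new_holding <= holding):
            opaque |= new_opaque
            holding |= new_holding
            changed = True
    critical = (
        {("store", u, v) for (u, v) in store if u in holding}
        | {("copy", u, v) for (u, v) in copy if u in opaque}
        | {("load", u, v) for (u, v) in load if u in opaque}
        | {("addr", u, v) for (u, v) in addr if u in opaque}
    )
    return holding, opaque, critical

# pete (Example 4): a := &b; *a := p; *b := q;
_, _, pete_critical = classify({"p", "q"},
                               store=[("a", "p"), ("b", "q")],
                               addr=[("a", "b")])

# oliv (Example 3), normalized with temporaries:
#   x := q;  t := &x;  *r := t;  t1 := *s;  t2 := *t1;  *p := t2;  y0 := *x
_, oliv_opaque, oliv_critical = classify(
    {"p", "q", "r", "s"},
    copy=[("x", "q")],
    addr=[("t", "x")],
    store=[("r", "t"), ("p", "t2")],
    load=[("t1", "s"), ("t2", "t1"), ("y0", "x")],
)
```

For pete, only *b:=q is reported as critical, whereas *a:=p writes to the local b and is not. For oliv, x is the only opaque variable, and x:=q becomes critical together with the two stores, in line with the assignments that Section 3.3 places into the summary.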



4.2 Back-Tracing 

Given critical assignments, including those due to opaque variables, 
summarization begins by initiating back-tracing as follows, where e and e' stand for 
arbitrary expressions: 

    critical(e := e')
    -----------------
     traced(e := e')

(a) stuck left-hand

    traced(*u := v)   u := w   holding(w)
    -------------------------------------
               traced(*w := v)

    traced(*u := v)   u := *w   holding(w)
    --------------------------------------
      add(*u := v) ∧ traced(u := *w)

(b) stuck right-hand

    traced(*u := v)   v := w   holding(w)
    -------------------------------------
               traced(*u := w)

    traced(*u := v)   v := &w
    -------------------------------
    add(*u := v) ∧ traced(v := &w)

    traced(*u := v)   v := *w   holding(w)
    --------------------------------------
      add(*u := v) ∧ traced(v := *w)

(c) stuck both sides

    traced(*u := v)   input(u)   input(v)
    -------------------------------------
                add(*u := v)

(d)

    traced(u := v)   v := &w
    ------------------------
        traced(u := &w)

    traced(u := v)   v := *w   holding(w)
    -------------------------------------
               traced(u := *w)

    traced(u := v)   v := w   holding(w)
    ------------------------------------
               traced(u := w)

    traced(u := v)   input(v)
    -------------------------
          add(u := v)

    traced(u := &v)
    ---------------
     add(u := &v)

(e)

    traced(u := *v)   v := w   holding(w)
    -------------------------------------
               traced(u := *w)

    traced(u := *v)   v := *w   holding(w)
    --------------------------------------
      add(u := *v) ∧ traced(v := *w)

    traced(u := *v)   input(v)
    --------------------------
          add(u := *v)

Fig. 6. Back-tracing of (a)(b)(c) store, (d) plain and address, and (e) load assignments. 

Though the back-tracing rules used by our analysis (shown in Figure 6) appear 
complicated, the invariant behind them is straightforward: once an assignment 
is traced, all its effects must be reproduced by the summary. Actual insertion of 
a back-traced assignment into the summary is deferred until it becomes unavoidable 
and will occur due to one of the following conditions. First, back-tracing 
has reached an input to the procedure (parameter or opaque variable). Second, 
the net effect from two assignments cannot be represented with a single assignment 
in the model language (a single edge in the actual implementation). For 
instance, u:=**v is not allowed and thus must be broken into u:=*t and t:=*v. 
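
The restriction to single-level dereferences can be illustrated with a small normalization routine. This is only a sketch of the idea that a multi-level load has to be expressed as a chain of single loads; the function name `split_load` and the temporary names t1, t2, ... are hypothetical.

```python
def split_load(dst, src, depth, fresh="t"):
    """Break dst := (depth dereferences of) src into single-level loads."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    stmts, cur = [], src
    for i in range(1, depth):
        tmp = f"{fresh}{i}"
        stmts.append(("load", tmp, cur))   # tmp := *cur
        cur = tmp
    stmts.append(("load", dst, cur))       # dst := *cur
    return stmts

print(split_load("u", "v", 2))   # [('load', 't1', 'v'), ('load', 'u', 't1')]
```

For u:=**v this yields t1:=*v followed by u:=*t1, which corresponds to the pair t:=*v and u:=*t above.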

5 Empirical Results 

To demonstrate the usefulness of our approach, we evaluated the presented 
techniques on all the C benchmarks from the SPEC 92, 95, and 2000 suites except 
redundant ones. This breadth of benchmarks shows a wide range of analysis 
characteristics. The main results are summarized in Figure 7. 

Since field information can greatly affect the analysis results, the analysis is 
implemented in a field-sensitive fashion. Our implementation tries to handle as 
many abuses of the C language as possible in a safe and accurate manner. For 
instance, our analysis is offset-based, as opposed to using field names as in [19]. 
Each variable is associated with accessed offsets, which are dynamically updated 
should any additional offsets be discovered during the analysis. 

To reduce summary size, the bottom-up phase performs redundancy elimination 
[17] and cycle elimination [9]. In addition, data flow through global variables 
is treated context-insensitively, which allows its exclusion from procedural 
summaries while not impacting resultant accuracy. 

                                          CS
    Benchmark        LOC       CI       BU       TD     R%
    164.gzip        7759     0.02     0.01     0.01     2%
    175.vpr        16973     0.25     0.10     0.01    35%
    176.gcc       205747    45.00   190.00     2.50     2%
    181.mcf         1909     0.01     0.02     0.01     -
    186.crafty     18977     0.10     0.07     0.01    3T%
    197.parser     10924     0.40     0.15     0.02     -
    253.perlbmk    57541   360.00   800.00    45.00     4%
    254.gap        59674   120.00   430.00    18.00     1%
    255.vortex     52634     6.00    10.00     1.00     -
    256.bzip2       4637     0.01     0.01     0.01     -
    300.twolf      19749     0.30     0.15     0.02     -

    008.espresso   13505     1.00     0.30     0.03    89%
    023.eqntott     3393     0.02     0.02     0.01     -
    099.go         28547     0.50     0.15     0.02     -
    124.m88ksim    17251     0.70     0.40     0.04     4%
    129.compress    1426     0.01     0.01     0.01     -
    130.li          6930     8.00     2.00     0.03     -
    132.ijpeg      25897     3.50    35.00     1.00     8%
    134.perl       23969     2.50    30.00     1.50

Fig. 7. Analysis time in seconds for the context-insensitive (CI) and context-sensitive 
(CS) analyses. The CS time is divided into the bottom-up (BU) and top-down (TD) 
portions. It also shows the percentage reduction of total points-to set size 
(the column R%) when switching from CI to CS. Times were measured on a 2.4 GHz 
Pentium 4 computer and never consumed more than 300 MB of memory. 

                          Proposed                  Foster et al.
    Benchmark        BU      TD    Ratio        BU        TD    Ratio
    008.espresso    0.31    0.03    0.10      28.81    967.64    33.6
    023.eqntott     0.02    0.01    0.50       1.50     11.20     7.5
    129.compress    0.01    0.01    1.00       0.41      1.42     3.5
    130.li          1.96    0.03    0.02     189.49   9929.88    52.4

Fig. 8. Analysis time in seconds compared to Foster et al. [12]. Ratio is the ratio of 
TD time to BU time. There is a clear difference in ratios, with the TD phase for the 
proposed analysis only contributing to a small percentage of the total time. 

Figure 7 lists the percentage reduction in total points-to set size when comparing 
context-insensitive (CI) results to context-sensitive (CS) results. This is 
calculated by summing the points-to set size for every location (variable offset) 
in the benchmark. While this may not directly correlate to beneficial accuracy 
improvement, we feel it provides useful comparison. There is a substantial 
range in benefit with some benchmarks showing little or no benefit (256.bzip2 
and 300.twolf) while others show a fairly substantial reduction (008.espresso at 
89% and 175.vpr at 35%). 
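
The R% metric can be stated compactly in code. The sketch below assumes the natural reading of the description above, namely that the reduction is taken relative to the CI total; the function name `reduction_percent` and the dictionary layout are hypothetical. The sample data are the value sets of Example 1 from Section 2, not benchmark results.

```python
def reduction_percent(ci, cs):
    """Percentage reduction in total points-to set size from CI to CS.

    ci, cs: {location: set of targets}; in the actual analysis a location
    would be a (variable, offset) pair rather than a plain name.
    """
    total_ci = sum(len(s) for s in ci.values())
    total_cs = sum(len(s) for s in cs.values())
    if total_ci == 0:
        return 0.0
    return 100.0 * (total_ci - total_cs) / total_ci

# Example 1: context-insensitive versus specialized results
ci = {"g": {1, 3}, "a": {1, 3}, "r": {6, 8}}
cs = {"g": {1}, "a": {3}, "r": {6}}
print(reduction_percent(ci, cs))   # 50.0
```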

Figure 7 lists two sets of analysis times (in seconds) for CI and CS analysis 
runs. The CS run times are broken into bottom-up and top-down times. When 
comparing the CI and CS analysis times, there is a large span ranging from 
cases where the CS analysis is faster (008.espresso, 130.li, 175.vpr, and 
197.parser) to cases where it is about 5-10x slower 
(132.ijpeg, 134.perl, and 254.gap). It is interesting to note that there is a strong 
correlation between the CI and CS accuracy and analysis time comparison. The 
more accuracy benefit the CS analysis provides, the faster it performs 
with respect to the CI analysis. 
It is also enlightening to compare the relative cost of the bottom-up and 
top-down phases. Aside from those benchmarks that require less than about 0.1 
second, the top-down phase of the analysis contributes little to the overall 
analysis time. It is generally between 1% and 10% of the bottom-up time. This is 
expected, because, while the bottom-up phase spends time summarizing procedures 
and specializing procedural side effects, the top-down phase only needs to 
propagate pointer information in a context-insensitive fashion. 

Fig. 9. Semi-log graph showing the number of variable locations versus call graph depth 
for both exhaustive inlining and our proposed summary-based inlining across SPEC. 

To provide a frame of reference concerning the efficiency of our top-down 
phase, Figure 8 presents top-down and bottom-up analysis times from the 
benchmarks in common between our work and Foster et al. [12]. The important 
comparison is the ratio of top-down to bottom-up time for each of the two works. For 
our analysis, the top-down time is never more than the bottom-up time, even for 
a benchmark as small as 129.compress, and becomes increasingly insignificant 
as benchmark size increases (2% for 130.li). We suspect that the copying 
overheads are what cause the top-down phase in Foster et al. to generally surpass 
the bottom-up time, reaching a factor of 50 for 130.li. 
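
The ratios quoted above follow directly from the BU and TD times in Figure 8. The short script below recomputes them from the published numbers; only the variable and function names are our own.

```python
# (benchmark, BU seconds, TD seconds), copied from Figure 8
proposed = [("008.espresso", 0.31, 0.03), ("023.eqntott", 0.02, 0.01),
            ("129.compress", 0.01, 0.01), ("130.li", 1.96, 0.03)]
foster = [("008.espresso", 28.81, 967.64), ("023.eqntott", 1.50, 11.20),
          ("129.compress", 0.41, 1.42), ("130.li", 189.49, 9929.88)]

def td_bu_ratio(rows):
    """Ratio of top-down time to bottom-up time per benchmark."""
    return {name: td / bu for name, bu, td in rows}

ours, theirs = td_bu_ratio(proposed), td_bu_ratio(foster)
for name in ours:
    print(f"{name:14s} {ours[name]:6.2f} {theirs[name]:8.1f}")
```

The recomputed values agree with the Ratio columns of Figure 8 after rounding: our ratio never exceeds 1, while the ratio for Foster et al. grows from about 3.5 for 129.compress to about 52 for 130.li.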

One major concern in terms of scalability is the problem size growth caused 
by the inclusion of specialized summaries. Figure 9 demonstrates the 
effectiveness of our bottom-up phase by comparing the theoretical growth in exhaustive 
inlining (upper graph) against the empirical results using our summary-based 
analysis (lower graph). In the figure, procedure main is always at a depth of one 
while leaf procedures are at the far right of a particular line spanning depths 
from 11 to 39. The y-axis is a logarithmic scale of the number of variables at 
that level. Thus, the values in the upper graph extend to 1 trillion while the 
lower graph extends only to 100K. 

It is apparent that the growth in our bottom-up phase is not explosive. An 
interesting aspect of the lower graph is that the size of summaries does not grow 
monotonically as it approaches main. Instead, it varies almost independently 
of the depth, increasing and decreasing according to compaction opportunities. 
The only spikes in the lower graph are due to the existence of large recursive 
cycles (100s of procedures). Even for these, summarization quickly diminishes 
their impact. 

6 Related Work 

Pointer analysis has been studied extensively in the literature. For a more com- 
plete list of previous pointer analyses, we refer readers to [15]. Context-sensitive 
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pointer analyses have been approached from several directions. In [8, 16, 19,25, 
27], procedures are re-analyzed until a global fixed point is reached. Among 
these, [27] is unique in the sense that the overhead of re-analysis is reduced by 
memoization. On the other hand, the analyses in [2,3,12,20,21] are based on 
procedure summaries formulated as two-phase computation. Among those, Fos- 
ter et al. [12] is the closest to the proposed pointer analysis. One of the analyses 
presented by Foster et al. is a context-sensitive extension of Andersen’s analy- 
sis. The bottom-up phase of this algorithm is similar to ours in many aspects. 
The difference is that, in Foster et al., the critical assignments in callees still 
exist after the bottom-up phase. Therefore, in addition to accuracy degradation, 
explicit copying seems unavoidable. 

CFL reachability [24] provides an alternative approach to context sensitivity. 
One of its merits over other approaches is that recursion can be handled also 
context-sensitively. Fähndrich et al. [10] employed a variant of CFL reachability
with unification-based modeling proposed by Steensgaard [26]. Das et al. [7] took
a similar approach using the data flow modeling proposed by Das [6] yet with 
one-level context sensitivity. 

The philosophy behind our pointer analysis is similar to counter-example
directed refinement, a popular scheme in model checking [5]. In this approach, 
refinement is driven by feedback from a less accurate abstraction. In our pointer 
analysis, instead, refinement is performed preemptively and exhaustively. Guyer 
and Lin [13] applied counter-example directed refinement to pointer analysis. 
When results computed by less accurate pointer analysis (both flow- and context- 
insensitive) turn out to be insufficient, the accuracy is refined by adding flow or 
context sensitivity. To add context sensitivity, a procedure call is specialized 
by cloning the callee’s body. After being specialized, all the procedures are re- 
analyzed until a global fixed point is reached. From this perspective, the tech- 
niques proposed in this paper can also greatly aid the efficiency of their analysis. 

Many algorithms, for instance [11, 12], employ techniques similar to backtracing
in an effort to construct a concise yet observably-equivalent summary
of the analysis information. Constraint simplification in general has been thoroughly
studied. For a more complete list of references and in-depth discussion, we
refer readers to [23]. The concept of opacity is similar to the escaping of objects
extensively discussed for object-oriented programs. For the list of references, we 
refer readers to [4]. 

7 Conclusion 

We have proposed a fully context-sensitive yet very efficient summary-based 
pointer analysis. The key aspect is that the bottom-up phase transforms a pro- 
gram into one lacking procedural side effects by cutting side-effect causing assign- 
ments from callees after they have been summarized. This prevents all spurious 
interprocedural data flow in the following top-down phase while preserving the 
pointer behavior. Empirical results support the effectiveness of the pointer anal- 
ysis both in accuracy and efficiency. We thank the IMPACT research group for 
their help and DARPA/MARCO-GSRC, which supported this work.

Abstract. A technique, based upon abstract interpretation, is presented
that allows general gate-level combinational asynchronous circuits with
uncertain delay characteristics to be reasoned about. Our approach is
particularly suited to the simulation and model checking of circuits where
the identification of possible glitch states (static and dynamic hazards)
is required.

A hierarchy of alternative abstractions linked by Galois connections is 
presented, each offering varying tradeoffs between accuracy and com- 
plexity. Many of these abstract domains resemble extended, multi-value 
logics: transitional logics that include extra values representing transi- 
tions as well as steady states, and static/clean logics that include the 
values S and C representing ‘unknown but fixed for all time’ and ‘can 
never glitch’ respectively. 



1 Introduction 

Most contemporary design approaches assume an underlying synchronous para- 
digm, where a single global signal drives the clock inputs of every flip flop in the 
circuit. As a consequence, nearly all synthesis, simulation and model checking 
tools assume synchronous semantics. Designs in which this rule is relaxed are 
generally termed asynchronous circuits. 

In a synchronous model, glitches (also known as static and dynamic hazards)
do not cause problems unless they occur on a wire used as a clock input; with
purely synchronous design rules¹ this cannot occur. However, such safety
restrictions are not enforced by the semantics of either Verilog or VHDL - it is
quite easy, deliberately or otherwise, to introduce unsafe logic into a clock path.

We present a technique, based upon abstract interpretation [1, 2], that allows 
the glitch states of asynchronous circuits to be identified and reasoned about. The 
approach taken involves a family of extended, multi-value transitional logics with
an underlying dense continuous time model, and has applications in synthesis, 
simulation and model checking. 

¹ Exactly one global clock net driving the clock inputs of all flip flops in the circuit.

Our logics are extended with extra values that capture transitions as well
as steady states, with an ability to distinguish clean, glitch-free signals from
dirty, potentially glitchy ones. As a motivating example, consider the circuits
represented by the expressions $(a \wedge c) \vee (\neg a \wedge b) \vee (b \wedge c)$
and $(a \wedge c) \vee (\neg a \wedge b)$.
With respect to steady-state values for $a$, $b$ and $c$, both circuits would appear
to be identical, with the latter representing a circuit that might result from
naive optimisation of the former. Our technique can straightforwardly illustrate
differences in their dynamic behaviour, however. Consider the critical case
$a = \uparrow_0$ and $b = c = T_0$, representing $b$ and $c$ being wired to true for
all time, and a clean transition from false to true on $a$ (this notation is defined
fully in Section 3):

$$(\uparrow_0 \wedge T_0) \vee (\neg\uparrow_0 \wedge T_0) \vee (T_0 \wedge T_0) = T_0$$

$$(\uparrow_0 \wedge T_0) \vee (\neg\uparrow_0 \wedge T_0) = T_{0..1}$$

The result $T_0$ may be interpreted as ‘true for all time, with no glitches’. However,
$T_{0..1}$ represents ‘true with zero or one glitches’, clearly demonstrating the poorer
dynamic behaviour of the smaller circuit.



1.1 Hardware Components 

In this paper we consider four² basic building blocks: (perfect, zero-delay) AND-gates,
(perfect) NOT-gates, delay elements (whose delays may depend on time,
and environmental factors like temperature, and thus are non-deterministic in a 
formal sense), and inertial delay elements. The difference between an ordinary 
delay and an inertial delay is that in the former the number of transitions on 
its input and output are equal, but in the latter a short-duration pulse from 
high-to-low and back (or vice versa) may be removed entirely from the output. 

Of course, real circuits are not so general, in particular no practically re- 
alisable circuit of non-zero size can have zero-delay. Hence real-life circuits all 
correspond to combinations of the above gates with some form of delay element. 
For the point of designing synchronous hardware all that matters is the maxi- 
mum delay which can occur from a circuit, so the exact positioning of the delays 
is often of little importance. When circuits are used asynchronously (e.g. for de- 
signing self-timed circuits without a global clock or, more prosaically, when their 
output is being used to gate a clock signal locally) then their glitch behaviour is 
often critically important. This leads to two models (the delay-insensitive (DI) 
and speed-independent (SI) models) of real hardware. In the SI model logic ele- 
ments may have delays, but wires do not; in the DI model both logic elements 
and wires have associated delay. One well-known fact about DI models is that it
is impossible to have an isochronic fork, whereby the transitions in output from
a given gate will arrive delayed contemporaneously at two other gates. Reasoning
in the DI model has become much more important recently as wire delays
(e.g. due to routing) have become dominant over single-gate element delays in
modern VLSI technologies [11].

² A perfect OR-gate can be constructed from perfect AND- and NOT-gates using de
Morgan’s law.

Fig. 1. The circuit $a \wedge \neg a$

Ordinary circuits may be embedded in our model as follows. In the SI model 
each physical logic gate at the hardware level is seen as a perfect logic gate whose 
output is then passed through a delay element. In the DI model, each physical 
logic gate is seen as a perfect logic gate whose input(s) first pass through separate 
delays. In essence, the SI and DI models of a circuit are translations of a physical 
circuit into idealised circuits composed solely of our four perfect elements. 

Now consider the circuit in Fig. 1. Seen as a perfect logic element, its output
is always false regardless of the value of its input signal. Seen as an SI circuit
(i.e. delays on the output of the AND and NOT), given an input $F_1$ which starts
at false then transitions to true and back, the circuit will be false at all times
except (possibly) for a small period just after the rising edge of the input, when
the upper AND-input will already be true, but before the delayed NOT-output
has yet become false. Thus the output is $F_{0|1}$ if we assume an inertial delay and
$F_1$ if we assume a non-inertial delay³.

In contrast, in the DI model, the separate delays on both inputs to the AND-gate
mean that the same input signal $F_1$ may result in small positive pulses on
both the rising and falling edge of the input; thus the output is described as
$F_{0|1|2}$. It is important to note that any of these three possible outputs may
occur; delays may vary with time, and can also differ on whether an input signal
is rising or falling.

Our abstract interpretation framework enables us to formally deduce the
above behaviours of the circuit shown in Fig. 1. Our reasoning is correct by
construction, because of the abstract interpretation framework. In some situations
our reasoning is also complete, in that all abstractly-predicted behaviours may be
made to happen by choosing suitable delay functions for the delay elements. For
example, in the DI model, our abstraction of the above circuit maps abstract signal
$F_1$ onto $F_{0|1|2}$, but the SI model cannot produce $F_2$ however (positive) delay
intervals are chosen.



1.2 Paper Structure 

In Section 2 we define a concrete domain that models signals as (possibly
nondeterministic) functions from time to the Boolean values. Section 3 describes
the most accurate (though complex) of our abstract domains; Sections 5 and 6
show how this can be further abstracted. Section 4 defines the operators necessary
to model circuits, and Section 4.1 discusses soundness and completeness of these
operators. Refinement and equivalence relations are discussed in Section 7.

³ This argument assumes positive delays; at times later in the paper we also allow
(non-physically realisable) delays by negative time.

2 Concrete Domain 

Definition 1. Concrete time $\mathbb{R}$ is continuous, linear and dense, having no
beginning or end.

Definition 2. A signal is a total function in $\mathbb{S} : \mathbb{R} \to \{0, 1\}$ from
concrete time to the Boolean values. More precisely, we restrict $\mathbb{S}$ to those
functions that are finitely piecewise constant⁴, i.e. there exists $\{k_1, \ldots, k_n\}$
which uniquely determines and is determined by a signal $s \in \mathbb{S}$ such that

$$\begin{aligned}
s(k_i) &= \neg s(k_{i+1}) && \forall\, 1 \le i < n; \\
s(x) &= s(k_i) && \forall\, k_i \le x < k_{i+1}; \\
s(-\infty) = s(x) &= \neg s(k_1) && \forall\, x < k_1; \\
s(+\infty) = s(x) &= s(k_n) && \forall\, x \ge k_n.
\end{aligned}$$

The function $\Psi s = \{k_1, \ldots, k_n\}$ represents the bijection which returns the set
of times at which signal $s$ has transitions; $|\Psi s|$ represents the total number, $n$, of
transitions made by $s$. As a further notational convenience, we denote the values
of $s$ at the beginning and end of time respectively as $s(-\infty)$ and $s(+\infty)$.

We model nondeterministic signals as members of the set $\wp(\mathbb{S})$; e.g. delaying
signal $s$ by time $\delta$, where $\delta_{min} \le \delta \le \delta_{max}$, gives
$\{\lambda \tau . s(\tau - \delta) \mid \delta_{min} \le \delta \le \delta_{max}\}$.



3 Abstract Domain 

3.1 Deterministic Traces 

Definition 3. A deterministic trace $t \in \mathbb{T}$ characterises a deterministic
signal $s \in \mathbb{S}$, retaining the transitions but abstracting away the times at which
they occur. Traces are denoted as finite lists of Boolean values bounded by angle
brackets (‘$\langle \ldots \rangle$’), and must contain at least one element - the empty
trace ‘$\langle\rangle$’ is not syntactically valid.

A singleton trace, denoted $\langle 0 \rangle$ or $\langle 1 \rangle$, represents a signal that
remains at 0 or 1 respectively for all time. For traces with two or more elements, e.g.
$\langle a, \ldots, b \rangle$, $a$ is the value at the beginning of time and $b$ is the value
at the end of time.

⁴ Note that we do not consider signals that contain an infinite number of transitions,
e.g. clocks that oscillate for all time. We can, however, reason about such signals by
‘windowing’ them within finite intervals (windows) $[p, q]$ of $\mathbb{R}$, resulting in
signals that are themselves finitely piecewise constant.

Table 1. Shorthand Notation: Deterministic Traces

  Shorthand         Meaning
  $F_0$             The trace $\langle 0 \rangle$ that is 0 for all time.
  $F_1$             The trace $\langle 0, 1, 0 \rangle$ that has 0 at the beginning and end,
                    containing exactly one pulse.
  $F_2$             The trace $\langle 0, 1, 0, 1, 0 \rangle$ that begins and ends with 0,
                    containing exactly two pulses.
  $F_n$             The trace $\langle 0, 1_1, 0, 1_2, 0, \ldots, 0, 1_n, 0 \rangle$ that begins and
                    ends with 0, containing exactly $n$ positive-going pulses.
  $T_0$             The trace $\langle 1 \rangle$ that is 1 for all time.
  $T_n$             The trace $\langle 1, 0_1, 1, 0_2, \ldots, 0_n, 1 \rangle$ that begins and ends
                    with 1, containing exactly $n$ negative-going pulses.
  $\uparrow_0$      The trace $\langle 0, 1 \rangle$ that cleanly transitions from 0 to 1.
  $\uparrow_n$      The trace $\langle 0, 1_1, 0, \ldots, 0, 1_n, 0, 1 \rangle$ that transitions from
                    0 to 1 through exactly $n$ intervening cycles.
  $\downarrow_0$    The trace $\langle 1, 0 \rangle$ that cleanly transitions from 1 to 0.
  $\downarrow_n$    The trace $\langle 1, 0_1, 1, \ldots, 1, 0_n, 1, 0 \rangle$ that transitions from
                    1 to 0 through exactly $n$ intervening cycles.



The trace $\langle 0, 1, 0 \rangle$ represents a signal that at the start of time takes the
value 0, then at some later time switches cleanly to 1, then back to 0 again before
the end of time. The instants at which these transitions occur are undefined,
although their time order must be preserved.

Values within traces may be discriminated only by their transitions. Therefore,
the trace $\langle 0, 0, 0, 0, 1, 1, 1 \rangle$ is equivalent to the trace $\langle 0, 1 \rangle$. It
follows from this that all traces may be reduced to a form that resembles an alternating
sequence $\langle \ldots, 0, 1, 0, 1, 0, 1, \ldots \rangle$. Any such sequence can be completely
characterised by its start and end values, along with the number of intervening full
cycles⁵. A convenient shorthand notation that takes advantage of this is defined
in Table 1.
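The reduction to alternating form and the shorthand of Table 1 can be illustrated with a minimal Python sketch. The function names `normalise` and `shorthand`, and the string tags `"F"`, `"T"`, `"up"` and `"down"` standing for $F$, $T$, $\uparrow$ and $\downarrow$, are assumptions made for illustration only; they are not part of the notation above.

```python
from itertools import groupby


def normalise(trace):
    """Collapse runs of repeated values, e.g. [0, 0, 1, 1] -> (0, 1)."""
    if not trace:
        raise ValueError("the empty trace is not valid")
    return tuple(value for value, _ in groupby(trace))


def shorthand(trace):
    """Return (kind, n): the start/end behaviour and the number of full cycles."""
    t = normalise(trace)
    first, last = t[0], t[-1]
    if first == last:
        # Begins and ends with the same value: F_n or T_n, with n pulses.
        kind = "T" if first else "F"
        n = (len(t) - 1) // 2
    else:
        # Begins and ends with different values: up_n or down_n.
        kind = "up" if last else "down"
        n = (len(t) - 2) // 2
    return kind, n
```

Under these assumptions, `shorthand([0, 0, 0, 0, 1, 1, 1])` gives the pair for $\uparrow_0$ and `shorthand([0, 1, 0])` gives the pair for $F_1$.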



3.2 Nondeterministic Traces 

Following the approach taken in Section 2, we represent nondeterministic traces
$\hat{t} \in \wp(\mathbb{T})$ as sets of deterministic traces⁶.

The need for this extra structure is demonstrated by the following example.
Let us attempt to specify the meaning of the expression
$\langle 0, 1 \rangle \wedge \neg\langle 0, 1 \rangle$, which represents the effect of feeding a
clean transition from 0 to 1 to the $a$ input of the circuit shown in Fig. 1. The $\neg$ can
be evaluated trivially, giving $\langle 0, 1 \rangle \wedge \langle 1, 0 \rangle$.
At first sight, it may appear that the resulting trace should be $\langle 0, 0 \rangle$ or just
$\langle 0 \rangle$. This would be the case if certain constraints on the exact times of the
transitions of the $\langle 1, 0 \rangle$ and $\langle 0, 1 \rangle$ traces were met, but it is not
sufficient to cope with all possibilities. If $\langle 1, 0 \rangle$ transitions before
$\langle 0, 1 \rangle$, then the result is indeed $\langle 0 \rangle$. Should the transitions occur
in the opposite order, the result is $\langle 0, 1, 0 \rangle$. Formally,

$$\{\langle 0, 1 \rangle\} \wedge \neg\{\langle 0, 1 \rangle\} = \{\langle 0, 1 \rangle\} \wedge \{\langle 1, 0 \rangle\} = \{\langle 0 \rangle\} \cup \{\langle 0, 1, 0 \rangle\} = \{\langle 0 \rangle, \langle 0, 1, 0 \rangle\}$$

⁵ It is of course also possible to represent traces completely in terms of their first (or
last) element and their length. However, the representation chosen here turns out to
be more convenient, e.g. comparing $\uparrow_0$ with $\uparrow_4$ makes it immediately obvious
that both represent traces that eventually transition from 0 to 1, with $\uparrow_0$ being
‘cleaner’ than $\uparrow_4$. The utility of this approach will become clear later.

⁶ We adopt the convention that $t$ and $\hat{t}$ are separate variables that range over
$\mathbb{T}$ and $\wp(\mathbb{T})$ respectively.

Definition 4. Where $\hat{t} \in \wp(\mathbb{T})$ and $\hat{u} \in \wp(\mathbb{T})$, the
nondeterministic choice $\hat{t} \mid \hat{u}$ is synonymous with $\hat{t} \cup \hat{u}$. For
notational compactness, we alternatively allow either or both of the arguments of $\mid$
to range over $\mathbb{T}$, e.g. where $t \in \mathbb{T}$, the expression $t \mid \hat{u}$ is
equivalent to $\{t\} \mid \hat{u}$.

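The ‘$\mid$’ operator allows the above equation to be expressed more compactly
as follows:

$$\langle 0, 1 \rangle \wedge \neg\langle 0, 1 \rangle = \langle 0, 1 \rangle \wedge \langle 1, 0 \rangle = \langle 0 \rangle \mid \langle 0, 1, 0 \rangle$$

Using the shorthand notation, this may equivalently be written as:

$$\uparrow_0 \wedge \neg\uparrow_0 = \uparrow_0 \wedge \downarrow_0 = F_0 \mid F_1$$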
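The case analysis above can be mechanised by enumerating every order in which the transitions of the two inputs may occur. The following Python sketch does this for any two-input Boolean gate; the names `interleavings` and `combine` are hypothetical, traces are assumed to be given as tuples already in alternating form, and the two inputs are assumed to be independent. Exactly coincident transitions are not enumerated separately, on the assumption (which holds for AND and OR) that a coincident pair produces the same output trace as one of the two strict orderings.

```python
from itertools import groupby


def interleavings(m, n):
    """Yield every order in which m transitions of input a and n of input b can occur."""
    if m == 0 and n == 0:
        yield ()
        return
    if m > 0:
        for rest in interleavings(m - 1, n):
            yield ("a",) + rest
    if n > 0:
        for rest in interleavings(m, n - 1):
            yield ("b",) + rest


def combine(op, t, u):
    """Apply a two-input Boolean gate to deterministic traces t and u.

    Returns the set of output traces (alternating form) over all orderings.
    """
    results = set()
    for order in interleavings(len(t) - 1, len(u) - 1):
        i = j = 0
        out = [op(t[0], u[0])]
        for who in order:
            # Advance whichever input makes its next transition.
            if who == "a":
                i += 1
            else:
                j += 1
            out.append(op(t[i], u[j]))
        # Collapse repeated output values into alternating form.
        results.add(tuple(v for v, _ in groupby(out)))
    return results


def AND(x, y):
    return x & y


def OR(x, y):
    return x | y
```

With these assumptions, `combine(AND, (0, 1), (1, 0))` yields the two traces corresponding to $F_0 \mid F_1$, and `combine(AND, (0, 1, 0), (1, 0, 1))` yields those corresponding to $F_{0|1|2}$, matching the DI behaviour described in Section 1.1.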
Definition 5. Letting $X$ range over $\{T, F, \uparrow, \downarrow\}$,

$$X_{m..n} \stackrel{\mathrm{def}}{=} \bigcup_{i=m}^{n} \{X_i\} \qquad X_{a_1|\ldots|a_n} \stackrel{\mathrm{def}}{=} X_{a_1} \mid \cdots \mid X_{a_n}$$

For example, $F_0 \mid F_1$ may equivalently be written as $F_{0|1}$, and rather than fully
enumerating a long list of alternate pulse counts of the form $F_{m|m+1|\ldots|n-1|n}$, the
preferred notation $F_{m..n}$ may be used instead. These notations may be combined,
e.g. $F_{0|3|5..7|10..12}$.

Nondeterministic choice obeys all the laws of set union, e.g.

$$a \mid a = a \qquad a \mid b = b \mid a \qquad a \mid (b \mid c) = (a \mid b) \mid c = a \mid b \mid c$$

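From this, various subscript laws follow, e.g.

$$X_{a|a} = X_a \qquad X_{a..a} = X_a$$

$$X_{a..b|c..d} = \begin{cases} X_{\min(a,c)..\max(b,d)} & \text{if } c \le b \wedge a \le d, \\ X_{a..b|c..d} & \text{otherwise.} \end{cases}$$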
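As a small illustration of the subscript notation, the sketch below expands a subscript into the set of pulse counts it denotes and renders such a set back in compact form. The function names and the plain-text syntax (`|` for choice, `..` for ranges) are assumptions chosen to mirror the notation; the formatter always merges consecutive counts into a range, which is one reasonable convention rather than a rule stated above.

```python
def parse_subscripts(text):
    """Expand a subscript such as '0|3|5..7|10..12' into a set of pulse counts."""
    counts = set()
    for part in text.split("|"):
        if ".." in part:
            lo, hi = part.split("..")
            counts.update(range(int(lo), int(hi) + 1))
        else:
            counts.add(int(part))
    return counts


def format_subscripts(counts):
    """Render a set of pulse counts compactly, merging consecutive runs into 'a..b'."""
    runs = []
    run = []
    for c in sorted(counts):
        if run and c == run[-1] + 1:
            run.append(c)
        else:
            if run:
                runs.append(run)
            run = [c]
    if run:
        runs.append(run)
    return "|".join(
        str(r[0]) if len(r) == 1 else f"{r[0]}..{r[-1]}" for r in runs
    )
```

Because the underlying representation is a set, the laws of set union (idempotence, commutativity, associativity) hold automatically, and overlapping ranges such as `0..2|1..4` collapse to `0..4`.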



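Definition 6. It is convenient to name the following least upper bounds w.r.t.
$(\wp(\mathbb{T}), \subseteq)$:

$$F_\star \stackrel{\mathrm{def}}{=} \bigcup_{n \in \mathbb{N}} \{F_n\} \qquad T_\star \stackrel{\mathrm{def}}{=} \bigcup_{n \in \mathbb{N}} \{T_n\} \qquad \uparrow_\star \stackrel{\mathrm{def}}{=} \bigcup_{n \in \mathbb{N}} \{\uparrow_n\} \qquad \downarrow_\star \stackrel{\mathrm{def}}{=} \bigcup_{n \in \mathbb{N}} \{\downarrow_n\}$$

$$\star \stackrel{\mathrm{def}}{=} F_\star \cup T_\star \cup \uparrow_\star \cup \downarrow_\star$$

Table 2. Boolean Functions on Traces

where m > 0, n > 0.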



3.3 Galois Connection 

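Definition 7. Given a deterministic concrete signal $s \in \mathbb{S}$, the abstraction
function $\beta : \mathbb{S} \to \mathbb{T}$ returns the corresponding deterministic trace:

$$\begin{aligned}
\beta s &\stackrel{\mathrm{def}}{=} \langle s(-\infty), s(k_1), \ldots, s(k_n) \rangle \quad \text{where } \{k_1, \ldots, k_n\} = \Psi s \\
&= \langle s(-\infty), \neg s(-\infty), s(-\infty), \neg s(-\infty), \ldots \rangle
\end{aligned}$$

Note that $\beta s$ has exactly $1 + |\Psi s|$ elements.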
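A concrete signal in the sense of Definition 2 and its abstraction under $\beta$ can be sketched as follows. The class `Signal`, its fields and the function `beta` are hypothetical names; the sketch assumes the transition times supplied are distinct, and represents a signal by its value at the beginning of time together with its transition times.

```python
from bisect import bisect_right


class Signal:
    """Finitely piecewise-constant Boolean signal.

    Stored as the value at the beginning of time plus sorted transition times
    (assumed distinct).
    """

    def __init__(self, initial, transitions=()):
        self.initial = bool(initial)
        self.transitions = tuple(sorted(transitions))

    def __call__(self, x):
        # Every transition at or before time x flips the value once.
        flips = bisect_right(self.transitions, x)
        return self.initial ^ (flips % 2 == 1)


def beta(s):
    """Abstract a signal to its trace: keep the sequence of values, forget the times."""
    value = s.initial
    trace = [int(value)]
    for _ in s.transitions:
        value = not value
        trace.append(int(value))
    return tuple(trace)
```

Two signals whose transitions occur at different times but in the same number map to the same trace, which is the sense in which $\beta$ abstracts away time.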

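Definition 8. The abstraction function $\alpha : \wp(\mathbb{S}) \to \wp(\mathbb{T})$ and
concretisation function $\gamma : \wp(\mathbb{T}) \to \wp(\mathbb{S})$ are defined as follows:

$$\alpha \hat{s} \stackrel{\mathrm{def}}{=} \{\beta s \mid s \in \hat{s}\} \qquad \gamma \hat{t} \stackrel{\mathrm{def}}{=} \{s \in \mathbb{S} \mid \beta s \in \hat{t}\}$$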

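Definition 9. Letting $\sim \,: \mathbb{S} \times \mathbb{S} \to \mathbb{B}$ represent the
equivalence relation $s_1 \sim s_2 \stackrel{\mathrm{def}}{=} \beta s_1 = \beta s_2$, the set
$\mathbb{S}^\sharp \stackrel{\mathrm{def}}{=} \mathbb{S}/{\sim}$ is the set of equivalence classes in
$\mathbb{S}$ with respect to $\sim$. The set $[s] \stackrel{\mathrm{def}}{=} \{s' \in \mathbb{S} \mid \beta s = \beta s'\}$
represents, for any $s \in \mathbb{S}$, the equivalence class containing that element.

Note that $\mathbb{S}^\sharp$ is isomorphic with $\mathbb{T}$.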

Theorem 1. Together, the adjoint functions $(\alpha, \gamma)$ form a Galois connection
between $\wp(\mathbb{S})$ and $\wp(\mathbb{T})$. Following Cousot & Cousot [2], Theorem 5.3.0.4
and Corollary 5.3.0.5, pp. 273, it is sufficient to show that
$\alpha \circ \gamma(x) \sqsubseteq x$ and $\gamma \circ \alpha(x) \sqsupseteq x$. We choose to prove
instead the slightly stronger $\alpha \circ \gamma(x) = x$, and since the ordering relations on
$\wp(\mathbb{S})$ and $\wp(\mathbb{T})$ are subset inclusion, we write $\supseteq$ rather than
$\sqsupseteq$. Proof: letting $x = \{x_1, \ldots, x_n\}$,

1. $\alpha \circ \gamma(x) = \alpha\{s \in \mathbb{S} \mid \beta s \in x\} = \{\beta s' \mid s' \in \{s \in \mathbb{S} \mid \beta s \in x\}\} = \{\beta s' \mid \beta s' \in x\} = x$.

2. $\gamma \circ \alpha(x) = \gamma \circ \alpha\{x_1, \ldots, x_n\} = \gamma\{\beta x_1, \ldots, \beta x_n\} = \gamma\{\beta x_1\} \cup \cdots \cup \gamma\{\beta x_n\} = \{s \in \mathbb{S} \mid \beta s = \beta x_1\} \cup \cdots \cup \{s \in \mathbb{S} \mid \beta s = \beta x_n\} = [x_1] \cup \cdots \cup [x_n] \supseteq \{x_1, \ldots, x_n\} = x$.

4 Circuits

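Definition 10. Circuits are modeled by composing four basic operators: zero
delay ‘and’ $\wedge$, zero delay ‘not’ $\neg$, transmission line delay $\Delta$ and inertial
delay $\Box$, which are defined on the concrete domain as follows:

$$\begin{aligned}
\wedge &\stackrel{\mathrm{def}}{=} \lambda(\hat{s}_1, \hat{s}_2).\{\lambda t. s_1(t) \wedge s_2(t) \mid s_1 \in \hat{s}_1 \wedge s_2 \in \hat{s}_2\} \\
\neg &\stackrel{\mathrm{def}}{=} \lambda\hat{s}.\{\lambda t. \neg s(t) \mid s \in \hat{s}\} \\
\Delta &\stackrel{\mathrm{def}}{=} \gamma \circ \alpha \\
\Box &\stackrel{\mathrm{def}}{=} \gamma \circ \Box^\sharp \circ \alpha
\end{aligned}$$

Their abstract counterparts are defined as follows:

$$\begin{aligned}
\wedge^\sharp &\stackrel{\mathrm{def}}{=} \alpha \circ \wedge \circ (\gamma, \gamma) \\
\neg^\sharp &\stackrel{\mathrm{def}}{=} \alpha \circ \neg \circ \gamma \\
\Delta^\sharp &\stackrel{\mathrm{def}}{=} \lambda x. x \\
\Box^\sharp &\stackrel{\mathrm{def}}{=} \lambda\hat{t}.\{t \in \mathbb{T} \mid \exists t' \in \hat{t}.\ Val(t) = Val(t') \wedge Subs(t) \le Subs(t')\}
\end{aligned}$$

where $Val : \mathbb{T} \to \{F, T, \uparrow, \downarrow\}$ and $Subs : \mathbb{T} \to \mathbb{N}$ are
defined as follows:

$$Val(X_n) \stackrel{\mathrm{def}}{=} X \qquad Subs(X_n) \stackrel{\mathrm{def}}{=} n$$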

And. The function $\wedge : \wp(\mathbb{S}) \times \wp(\mathbb{S}) \to \wp(\mathbb{S})$ represents a
perfect zero-delay AND gate. Its abstract counterpart,
$\wedge^\sharp : \wp(\mathbb{T}) \times \wp(\mathbb{T}) \to \wp(\mathbb{T})$, is defined in terms
of $\wedge$ by composition with $\alpha$ and $\gamma$; note that our semantics is based upon an
independent attribute model [10].

Not. The bijective function $\neg : \wp(\mathbb{S}) \to \wp(\mathbb{S})$ represents a perfect
zero-delay NOT gate. As with $\wedge$, we define
$\neg^\sharp : \wp(\mathbb{T}) \to \wp(\mathbb{T})$ by composition of the concrete operator
$\neg$ with $\alpha$ and $\gamma$. When tabulated, $\wedge^\sharp$ and $\neg^\sharp$ behave as shown
in Table 2.

Transmission Line (Non-inertial) Delay. Our definition of transmission line delay
is essentially a superset of all possible delay functions that preserve the
underlying trace structure of the signal. The definition, $\gamma \circ \alpha$, captures this
behaviour straightforwardly; the $\alpha$ function abstracts away all details of time,
though preserves transitions and the values at the beginning and end of time,
then $\gamma$ concretises this, resulting in the set of all possible traces with similar
structure. This definition is more permissive than more typical notions of delay
in that it includes negative as well as positive time shifts as well as transformations
that can stretch or compress (though not remove or reorder) pulses.

Inertial Delay. Inertial delay is broadly similar to transmission line delay, except
that, as well as changing the time at which transitions may occur, one or more
complete pulses (i.e. pairs of adjacent transitions) may be removed. This models
a common property of some physical components, whereby very short pulses are
‘soaked up’ by internal capacitance and/or inductance and thereby not passed
on. We model inertial delay in the abstract domain - in effect, nondeterministic
traces are mapped to convex hulls of the form
$F_{0..a} \mid T_{0..b} \mid \uparrow_{0..c} \mid \downarrow_{0..d}$. The
concrete inertial delay operator $\Box$ is defined in terms of $\Box^\sharp$ by composition with
$\gamma$ and $\alpha$, so as with transmission line delay, it encompasses all possible (correct)
inertial delay functions. It can be noted that, for all $\hat{s} \in \wp(\mathbb{S})$,
$\Delta\hat{s} \subseteq \Box\hat{s}$.
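The two abstract delay operators are simple enough to sketch directly. In the following Python fragment a deterministic trace is represented as a pair `(val, subs)` corresponding to $Val$ and $Subs$; this encoding and the function names are assumptions made for illustration.

```python
def transmission(traces):
    """Abstract transmission-line delay: the identity on sets of traces."""
    return set(traces)


def inertial(traces):
    """Abstract inertial delay.

    For every trace (val, subs) in the input, admit every trace with the same
    val and a subscript no larger than subs, i.e. pulses may be removed but
    never added, and the start/end values are preserved.
    """
    return {(val, k) for (val, subs) in traces for k in range(subs + 1)}
```

For instance, `inertial({("F", 1)})` contains both `("F", 0)` and `("F", 1)`, mirroring the $F_{0|1}$ output described for the SI model in Section 1.1, and the result of `transmission` is always a subset of the result of `inertial`.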



4.1 Soundness and Completeness 

An abstract function $f^\sharp$ may be described as sound with respect to a concrete
function $f$ if all behaviours exhibited by $f$ are within the set of possible
behaviours predicted by $f^\sharp$. Where these sets are identical (i.e. where $f^\sharp$ predicts
all possible behaviours of $f$), completeness holds [5-8, 12], two forms of which
are defined below.

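Definition 11. Given a concrete domain $D$ and an abstract domain $D^\sharp$, related
by adjoint functions $(\alpha, \gamma)$ that form a Galois connection (i.e.
$\alpha \circ \gamma(x) \sqsubseteq x$ and $\gamma \circ \alpha(x) \sqsupseteq x$), a pair of functions
$f : D \to D$ and $f^\sharp : D^\sharp \to D^\sharp$ may be said to
be sound iff the following (equivalent) relations hold:

$$\alpha \circ f \sqsubseteq f^\sharp \circ \alpha \qquad f \circ \gamma \sqsubseteq \gamma \circ f^\sharp$$

Definition 12. Let $f^{best} \stackrel{\mathrm{def}}{=} \alpha \circ f \circ \gamma$.

Definition 13. Where $f^\sharp = f^{best}$ and $f \circ \gamma = \gamma \circ f^\sharp$, the property
$\gamma$-completeness holds.

Definition 14. Where $f^\sharp = f^{best}$ and $\alpha \circ f = f^\sharp \circ \alpha$, the property
$\alpha$-completeness holds.

Note that $\alpha$-completeness and $\gamma$-completeness are orthogonal properties; neither
implies the other, though if either or both kinds of completeness hold,
soundness must also hold.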

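Theorem 2. The transmission line delay operator $(\Delta, \Delta^\sharp)$ is sound,
$\alpha$-complete and $\gamma$-complete. Proof:

1. $\Delta^{best} = \alpha \circ \Delta \circ \gamma = \alpha \circ \gamma \circ \alpha \circ \gamma = \alpha \circ \gamma = (\lambda x.x) = \Delta^\sharp$.

2. $\alpha$-completeness: $\alpha \circ \Delta = \alpha \circ \gamma \circ \alpha = \alpha = (\lambda x.x) \circ \alpha = \Delta^\sharp \circ \alpha$.

3. $\gamma$-completeness: $\Delta \circ \gamma = \gamma \circ \alpha \circ \gamma = \gamma = \gamma \circ (\lambda x.x) = \gamma \circ \Delta^\sharp$.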

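Theorem 3. The inertial delay operator $(\Box, \Box^\sharp)$ is sound, $\alpha$-complete and
$\gamma$-complete. Proof:

1. $\Box^{best} = \alpha \circ \Box \circ \gamma = \alpha \circ \gamma \circ \Box^\sharp \circ \alpha \circ \gamma = \Box^\sharp$.

2. $\alpha$-completeness: $\alpha \circ \Box = \alpha \circ \gamma \circ \Box^\sharp \circ \alpha = \Box^\sharp \circ \alpha$.

3. $\gamma$-completeness: $\Box \circ \gamma = \gamma \circ \Box^\sharp \circ \alpha \circ \gamma = \gamma \circ \Box^\sharp$.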

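Theorem 4. The perfect NOT operator $(\neg, \neg^\sharp)$ is sound, $\alpha$-complete and
$\gamma$-complete. Proof:

1. $\neg^{best} = \alpha \circ \neg \circ \gamma = \neg^\sharp$.

2. Since $\neg$ is a bijection, $\gamma \circ \alpha \circ \neg = \neg \circ \gamma \circ \alpha$.

3. $\alpha$-completeness: $\alpha \circ \neg = \alpha \circ \gamma \circ \alpha \circ \neg = \alpha \circ \neg \circ \gamma \circ \alpha = \neg^\sharp \circ \alpha$.

4. $\gamma$-completeness: $\neg \circ \gamma = \neg \circ \gamma \circ \alpha \circ \gamma = \gamma \circ \alpha \circ \neg \circ \gamma = \gamma \circ \neg^\sharp$.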

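Theorem 5. The perfect AND operator $(\wedge, \wedge^\sharp)$ is sound⁷. Proof:

1. $\wedge \circ (\gamma, \gamma) \subseteq \gamma \circ \wedge^\sharp = \gamma \circ \alpha \circ \wedge \circ (\gamma, \gamma)$.

Note that whilst perfect, zero-delay AND is sound but not complete, a composite
speed-independent AND ($\wedge_{SI} \stackrel{\mathrm{def}}{=} \Delta \circ \wedge$,
$\wedge^\sharp_{SI} \stackrel{\mathrm{def}}{=} \Delta^\sharp \circ \wedge^\sharp$) can straightforwardly
be shown to be $\gamma$-complete, but not $\alpha$-complete. Dually, delay-insensitive
AND ($\wedge_{DI} \stackrel{\mathrm{def}}{=} \wedge \circ (\Delta, \Delta)$,
$\wedge^\sharp_{DI} \stackrel{\mathrm{def}}{=} \wedge^\sharp \circ (\Delta^\sharp, \Delta^\sharp)$) is
$\alpha$- but not $\gamma$-complete.

We find, however, that ($\wedge_{complete} \stackrel{\mathrm{def}}{=} \Delta \circ \wedge \circ (\Delta, \Delta)$,
$\wedge^\sharp_{complete} \stackrel{\mathrm{def}}{=} \Delta^\sharp \circ \wedge^\sharp \circ (\Delta^\sharp, \Delta^\sharp)$)
is both $\alpha$- and $\gamma$-complete.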



5 Finite Versions of the Abstract Domain 

The abstract domain defined in Section 3 allows arbitrary asynchronous combi- 
national circuits to be reasoned about. In this section we present a number of 
simplifications of this basic model which allow accuracy to be traded off against 
levels of abstraction. The model presented in Section 3 is useful in identifying 
possible glitches within circuits, though in this case generally one is interested 
in whether a particular signal can glitch, rather than the number of possible
glitches - this requires less information than that captured by our original ab- 
straction. It follows that further abstraction should be possible, which is indeed 
the case. 

5.1 Collapsing Non-zero Subscripts: The 256-Value Transitional Logic $\mathbb{T}_{256}$

Mapping all non-zero subscript traces $t \in X_{1..\infty}$ to the single abstract value
$X_+$, for $X$ ranging over $\{F, T, \uparrow, \downarrow\}$, makes it possible to define a
finite abstract domain with a Galois connection to $\mathbb{T}$. This domain has the
desirable property of abstracting away details of ‘how glitchy’ a trace may be, whilst
retaining the ability to distinguish clean traces from dirty traces.

⁷ Note that we adopt an independent attribute model when considering the dyadic
nature of AND.

Table 3. Operators on $\mathbb{T}_c$

[Table body not recoverable from the source. Columns: $t$, $\neg_c t$, $\Delta_c t$, $\Box_c t$,
and $t \wedge_c u$ for $u$ ranging over $F_0, F_+, T_0, T_+, \uparrow_0, \uparrow_+, \downarrow_0, \downarrow_+$;
one row for each $t \in \mathbb{T}_c$.]

where $F_? \stackrel{\mathrm{def}}{=} F_0 \mid F_+$, $T_? \stackrel{\mathrm{def}}{=} T_0 \mid T_+$,
$\uparrow_? \stackrel{\mathrm{def}}{=} \uparrow_0 \mid \uparrow_+$,
$\downarrow_? \stackrel{\mathrm{def}}{=} \downarrow_0 \mid \downarrow_+$.

Definition 15. The abstract domain of subscript-collapsed deterministic traces
is the set $\mathbb{T}_c \stackrel{\mathrm{def}}{=} \{F_0, F_+, T_0, T_+, \uparrow_0, \uparrow_+, \downarrow_0, \downarrow_+\}$.
Following the usual convention, the corresponding abstract domain of subscript-collapsed
nondeterministic traces is the set $\mathbb{T}_{256} \stackrel{\mathrm{def}}{=} \wp(\mathbb{T}_c)$.

Note that unlike $\mathbb{T}$ and $\wp(\mathbb{T})$, both $\mathbb{T}_c$ and $\wp(\mathbb{T}_c)$ are
finite sets, with 8 and 256 members respectively.

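Definition 16. The Galois connection $\alpha_c : \wp(\mathbb{T}) \to \wp(\mathbb{T}_c)$,
$\gamma_c : \wp(\mathbb{T}_c) \to \wp(\mathbb{T})$ is defined as follows:

$$\beta_c X_n \stackrel{\mathrm{def}}{=} \begin{cases} X_0 & \text{if } n = 0, \\ X_+ & \text{otherwise.} \end{cases}$$

$$\alpha_c \hat{t} \stackrel{\mathrm{def}}{=} \{\beta_c t \mid t \in \hat{t}\} \qquad \gamma_c \hat{t} \stackrel{\mathrm{def}}{=} \{t \in \mathbb{T} \mid \beta_c t \in \hat{t}\}$$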
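The subscript-collapsing abstraction can be sketched in a few lines of Python. Traces are encoded as `(val, subs)` pairs and collapsed traces as `(val, "0")` or `(val, "+")`; this encoding and the function names are assumptions for illustration. Since the concretisation of a collapsed set is in general infinite, it is represented here by a membership test rather than by an enumerated set.

```python
def beta_c(trace):
    """Collapse a deterministic trace: keep the kind, reduce the subscript to '0' or '+'."""
    val, subs = trace
    return (val, "0" if subs == 0 else "+")


def alpha_c(traces):
    """Abstract a set of traces to a set of subscript-collapsed traces."""
    return {beta_c(t) for t in traces}


def in_gamma_c(trace, collapsed):
    """Membership test for the (generally infinite) concretisation of a collapsed set."""
    return beta_c(trace) in collapsed


# The eight collapsed values and the size of the resulting powerset domain.
T_C = {(v, s) for v in ("F", "T", "up", "down") for s in ("0", "+")}
```

With this encoding, `len(T_C)` is 8 and the powerset has `2 ** len(T_C)`, i.e. 256, members, in agreement with the note following Definition 15.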

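It is possible to tabulate 256 x 256 truth tables that fully enumerate all
members of $\mathbb{T}_{256}$ along their edges, but they are too large to reproduce here in
full. For brevity, Table 3 defines the operators $\neg_c : \mathbb{T}_c \to \wp(\mathbb{T}_c)$,
$\Delta_c : \mathbb{T}_c \to \wp(\mathbb{T}_c)$, $\Box_c : \mathbb{T}_c \to \wp(\mathbb{T}_c)$ and
$\wedge_c : \mathbb{T}_c \times \mathbb{T}_c \to \wp(\mathbb{T}_c)$ on $\mathbb{T}_c$. Their fully
nondeterministic versions, defined on $\wp(\mathbb{T}_c)$, are as follows:

$$\neg_c \hat{t} \stackrel{\mathrm{def}}{=} \bigcup_{t \in \hat{t}} \{\neg_c t\} \qquad \Delta_c \hat{t} \stackrel{\mathrm{def}}{=} \bigcup_{t \in \hat{t}} \{\Delta_c t\} \qquad \Box_c \hat{t} \stackrel{\mathrm{def}}{=} \bigcup_{t \in \hat{t}} \{\Box_c t\} \qquad \hat{t} \wedge_c \hat{u} \stackrel{\mathrm{def}}{=} \bigcup_{t \in \hat{t},\, u \in \hat{u}} \{t \wedge_c u\}$$

Note that, as with $\Delta^\sharp$, the $\Delta_c$ operator is merely an identity function.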
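The lifting of operators from single collapsed traces to sets of collapsed traces can be sketched as below. The helpers `lift1` and `lift2` are hypothetical names. The two example operators `not_c` and `inertial_c` are assumptions derived from the definitions of $Val$ and $Subs$ (negation swaps $F$ with $T$ and $\uparrow$ with $\downarrow$ while keeping the subscript; inertial delay may turn a dirty value into a clean one); they are not transcribed from Table 3.

```python
def not_c(t):
    """Assumed collapsed NOT: swap F/T and up/down, keep the subscript."""
    val, sub = t
    flipped = {"F": "T", "T": "F", "up": "down", "down": "up"}[val]
    return {(flipped, sub)}


def inertial_c(t):
    """Assumed collapsed inertial delay: a dirty value may become clean."""
    val, sub = t
    return {(val, "0")} if sub == "0" else {(val, "0"), (val, "+")}


def lift1(op, ts):
    """Lift a unary operator on single values to sets: the union of all results."""
    out = set()
    for t in ts:
        out |= op(t)
    return out


def lift2(op, ts, us):
    """Lift a binary operator on single values to sets, over all pairs."""
    out = set()
    for t in ts:
        for u in us:
            out |= op(t, u)
    return out
```

A binary table such as $\wedge_c$ would be supplied to `lift2` as a function from a pair of collapsed traces to a set of collapsed traces.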



6 Further Simplification of the Abstract Domain 

A fully tabulated version of the $\neg_c$, $\Delta_c$, $\Box_c$ and $\wedge_c$ operators defined in
Section 5.1 can be regarded as a 256-value transitional logic, where the values are the
members of $\wp(\mathbb{T}_c)$. Such an approach still captures more nondeterminism than
is useful for many applications. It is possible to further reduce the abstract
domain, replacing some nondeterministic choices with appropriate least upper
bound elements with respect to $(\wp(\mathbb{T}_c), \subseteq)$. The hierarchy of domains that
results is shown in Fig. 2 - the relationship to 2-value Boolean logic $\mathbb{B}$ and 3-value
ternary logic $\mathbb{B}_3$ is shown⁸. Note that since $\mathbb{B}$ lacks an upper bound that
corresponds with $\star$, it is not possible to define $\alpha : \mathbb{B}_3 \to \mathbb{B}$ (though
$\gamma : \mathbb{B} \to \mathbb{B}_3$ can be trivially defined), so a Galois connection does not exist
in that particular case. Following Cousot & Cousot [1, 2], the domain $\mathbb{U}$, useless
logic, containing only $\star$, completes the lattice.

Fig. 2. Hierarchy of Domains

Finding the smallest lattice including
$\{F_0, F_+, T_0, T_+, \uparrow_0, \uparrow_+, \downarrow_0, \downarrow_+\}$ that is
closed under $\wedge_c$, $\neg_c$, $\Delta_c$ and $\Box_c$ results in the 13-value transitional logic,

$$\mathbb{T}_{13} \stackrel{\mathrm{def}}{=} \{F_0, F_+, F_?, T_0, T_+, T_?, \uparrow_0, \uparrow_+, \uparrow_?, \downarrow_0, \downarrow_+, \downarrow_?, \star\}$$

Though much smaller than $\wp(\mathbb{T}_c)$, this logic is equivalently useful for most
purposes - note that a special element needs to be explicitly included, $\star$,
representing the least upper bound (top element) of the lattice.

In cases where it is important to know that a trace is definitely clean, but
where it is not necessary to distinguish between ‘definitely dirty’ and ‘possibly
dirty’, further reducing the domain by folding $F_+$, $T_+$, $\uparrow_+$ and $\downarrow_+$ into their
respective least upper bounds $F_?$, $T_?$, $\uparrow_?$ and $\downarrow_?$ results in a 9-value
transitional logic,
$\mathbb{T}_9 \stackrel{\mathrm{def}}{=} \{F_0, F_?, T_0, T_?, \uparrow_0, \uparrow_?, \downarrow_0, \downarrow_?, \star\}$.
An even simpler 5-value transitional logic
$\mathbb{T}_5 \stackrel{\mathrm{def}}{=} \{F, T, \uparrow, \downarrow, \star\}$ results from folding all
remaining nondeterminism into $\star$. $\mathbb{T}_{13}$ and $\mathbb{T}_9$ are well suited to logic
simulation, refinement and model checking, whereas $\mathbb{T}_5$ is only recommended for glitch
checking.



6.1 Static/Clean Logics 

The $\mathbb{T}_{13}$, $\mathbb{T}_9$ and $\mathbb{T}_5$ logics can be usefully extended by
introducing two extra upper bounds: $S$, the least upper bound of traces whose values are
fixed for all time, and $C$, the least upper bound of traces that may transition, but that
never glitch.

⁸ As with our other logics, we assume that $F \sqsubseteq \star$ and $T \sqsubseteq \star$ - some
ternary logics in the literature (notably Kleene’s) lack this formal requirement.

Definition 17. With respect to $\wp(\mathbb{T}_c)$, the least upper bounds $S$, $C$ and $\star$ are
defined as follows:

$$S \stackrel{\mathrm{def}}{=} \{F_0, T_0\} \qquad C \stackrel{\mathrm{def}}{=} \{F_0, T_0, \uparrow_0, \downarrow_0\} \qquad \star \stackrel{\mathrm{def}}{=} \mathbb{T}_c$$



The resulting static/clean transitional logics
$\mathbb{T}_{15} \stackrel{\mathrm{def}}{=} \mathbb{T}_{13} \cup \{S, C\}$,
$\mathbb{T}_{11} \stackrel{\mathrm{def}}{=} \mathbb{T}_9 \cup \{S, C\}$ and
$\mathbb{T}_7 \stackrel{\mathrm{def}}{=} \mathbb{T}_5 \cup \{S, C\}$ have applications in the design
rule checking of ‘impure’ synchronous circuits. For example, a gated clock input represented
by $S \wedge C = C$ might be accepted by a model checker, but $C \wedge C = \star$ would not.

Removing $\uparrow$ and $\downarrow$ from $\mathbb{T}_7$ results in a 5-value static/clean logic
$\mathbb{T}_5 = \{F, T, S, C, \star\}$ capable of reasoning about gated clock synchronous
circuits; an even simpler (though less accurate) 3-value static/clean logic
$\mathbb{T}_3 \stackrel{\mathrm{def}}{=} \{S, C, \star\}$ results from also removing $F$ and $T$.
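The role of $S$ and $C$ as upper bounds can be illustrated by a small classifier that maps a set of collapsed traces to the least of $S$, $C$ and $\star$ that covers it, i.e. an abstraction into the 3-value static/clean logic. The encoding of collapsed traces as `(val, "0")` or `(val, "+")` pairs, the function name `classify` and the use of the string `"*"` for $\star$ are assumptions for illustration.

```python
# Collapsed traces whose value is fixed for all time.
STATIC = {("F", "0"), ("T", "0")}
# Collapsed traces that may transition but never glitch.
CLEAN = STATIC | {("up", "0"), ("down", "0")}


def classify(values):
    """Return the least of 'S', 'C' and '*' whose definition covers the given set."""
    values = set(values)
    if values <= STATIC:
        return "S"
    if values <= CLEAN:
        return "C"
    return "*"
```

A design rule checker in the spirit described above might, for example, accept a clock net classified as `"S"` or `"C"` and reject one classified as `"*"`.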



7 Refinement and Equivalence in Transitional Logics 

Hardware engineers frequently concern themselves with modification and
optimisation of existing circuits, so it is appropriate to support this by defining
equivalence and refinement with respect to our abstract domains. Refinement
relationships between circuits are analogous to concepts of refinement in process
calculi, and may similarly be used to aid provably correct design. For example,
the Boolean equivalence $a \wedge \neg a = F$ is not a strong equivalence in our model, nor
is it a weak equivalence - it actually turns out to be a (left-to-right) refinement,
i.e. $a \wedge \neg a \succeq F_0$, reflecting the ‘engineer’s intuition’ that it is safe to replace
$a \wedge \neg a$ with $F_0$, but that the converse could damage the functionality of the circuit by
introducing new glitch states that were not present in the original design.

Informally, if the deterministic trace $u \in \mathbb{T}$ refines (i.e. retains the steady
state behaviour of, but is no more glitchy than) trace $t \in \mathbb{T}$, this may be denoted
$t \succeq u$.

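Definition 18. Given a pair of traces $t \in T$ and $u \in T$, 

\[ t \succeq u \;\overset{\mathrm{def}}{=}\; Val(t) = Val(u) \land Subs(t) \ge Subs(u) \]

For example, $F_1 \succeq F_0$, $T_3 \succeq T_0$, $\uparrow_5 \succeq \uparrow_3$, but $\downarrow_0$ and $\uparrow_1$ are incomparable. 
Where $t \in T$ and $u \in T$, if $t \succeq u$ and $u \succeq t$, it follows that $t = u$. 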
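The check in Definition 18 can be sketched in a few lines of Python. This is only an illustration: the `Trace` record, the string encoding of `Val` and the function name `refined_by` are assumptions made for the example and are not part of the paper's formalism.

```python
from typing import NamedTuple


class Trace(NamedTuple):
    """A deterministic trace, reduced to the two components used by Definition 18."""
    val: str   # Val(t): the steady-state behaviour, e.g. "F", "T", "up", "down"
    subs: int  # Subs(t): the subscript of the trace


def refined_by(t: Trace, u: Trace) -> bool:
    """True when u refines t: same Val, and Subs(t) >= Subs(u)."""
    return t.val == u.val and t.subs >= u.subs


# F1 may safely be replaced by F0, but not the other way round.
print(refined_by(Trace("F", 1), Trace("F", 0)))
print(refined_by(Trace("F", 0), Trace("F", 1)))
# Traces with different Val are incomparable in both directions.
print(refined_by(Trace("down", 0), Trace("up", 1)))
```

Because the relation requires equal `Val` and compares subscripts with `>=`, two traces that refine each other must have equal components, which matches the closing remark of the definition.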

Refinement and equivalence for nondeterministic traces is slightly less 
straightforward, in that it is necessary to handle cases like ii|3|5 ^ ig|2|4- To 
make these comparable, we construct convex hulls of the form Aio. .„ enclosing the 
nondeterministic choices, so the above case becomes equivalent to J,g 5 ^ io.,4. 
In effect, this approach compares worst-case behaviour, disregarding finer de- 
tail; in practice, since A, A, □ and ^ typically return results of the general form 
Afg .n anyway, this tends not to cause any practical difficulties. Less permissive 




194 Sarah Thompson and Alan Mycroft 



definitions of refinement, e.g. $\hat{t} \succeq_{strict} \hat{u} \overset{\mathrm{def}}{=} \forall t \in \hat{t} \,.\, \forall u \in \hat{u} \,.\, t \succeq u$, often disallow 
too many possible optimisations that in practice are quite acceptable - our 
model better reflects the engineer’s intuition that ‘less glitchy is better,’ but that 
very detailed information about the structure of possible glitches is generally not 
important. 

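Definition 19. Where $\hat{t} \in \wp(T)$ and $\hat{u} \in \wp(T)$, 

\[ \hat{t} \succeq \hat{u} \;\overset{\mathrm{def}}{=}\; (\forall t \in \hat{t}, u \in \hat{u} \,.\, Val(t) = Val(u)) \land MaxSubs(\hat{t}) \ge MaxSubs(\hat{u}) \]

where $MaxSubs(\hat{t}) \overset{\mathrm{def}}{=} \max_{t \in \hat{t}} Subs(t)$ is a function returning the largest subscript 
of a nondeterministic trace. 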
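As a rough illustration of Definition 19, the following Python sketch represents a nondeterministic trace as a non-empty set of `(val, subs)` pairs. The representation and the names `max_subs` and `refined_by_nd` are assumptions chosen for the example.

```python
from typing import FrozenSet, Tuple

Trace = Tuple[str, int]        # (Val, Subs) of one deterministic choice
NdTrace = FrozenSet[Trace]     # a non-empty set of nondeterministic choices


def max_subs(ts: NdTrace) -> int:
    """MaxSubs: the largest subscript among the choices."""
    return max(subs for _, subs in ts)


def refined_by_nd(ts: NdTrace, us: NdTrace) -> bool:
    """True when us refines ts: all Val agree and MaxSubs(ts) >= MaxSubs(us)."""
    same_val = all(tv == uv for tv, _ in ts for uv, _ in us)
    return same_val and max_subs(ts) >= max_subs(us)


glitchy = frozenset({("down", 1), ("down", 3), ("down", 5)})
cleaner = frozenset({("down", 0), ("down", 2), ("down", 4)})
print(refined_by_nd(glitchy, cleaner))
print(refined_by_nd(cleaner, glitchy))
```

Only the worst-case subscript of each side is compared, which is the convex-hull view described in the text: the individual choices inside each set do not matter.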



Equivalence of Non-deterministic Traces. Where $\hat{t} \in \wp(T)$ and $\hat{u} \in \wp(T)$, if $\hat{t} = \hat{u}$ 
then the traces are strongly equivalent, i.e. they represent exactly the same sets 
of nondeterministic choices. If the convex hulls surrounding $\hat{t}$ and $\hat{u}$ are identical, 
as is the case when $\hat{t} \succeq \hat{u} \land \hat{u} \succeq \hat{t}$, the traces may be said to be weakly equivalent, 
denoted $\hat{t} \simeq \hat{u}$. Where $\hat{t} \succeq \hat{u} \lor \hat{u} \succeq \hat{t}$, the traces are comparable, denoted $\hat{t} \diamond \hat{u}$. 



Finite Abstract Domains. Refinement and equivalence can also be defined for 
the finite abstract domain $T_{256}$ and some of its simplified forms. Since $T_{256}$ is 
implicitly nondeterministic, we do not need to consider the deterministic case. 

Definition 20. Given traces $t \in T_{256}$ and $u \in T_{256}$, 

\[ t \succeq u \;\overset{\mathrm{def}}{=}\; Val(t) = Val(u) \land (Subs(t) = Subs(u) \lor Subs(u) = 0) \]

\[ t \simeq u \;\overset{\mathrm{def}}{=}\; t \succeq u \land u \succeq t \;\equiv\; t = u \]

\[ t \diamond u \;\overset{\mathrm{def}}{=}\; t \succeq u \lor u \succeq t \;\equiv\; Val(t) = Val(u) \]

8 Related Work 

There seems to be relatively little work reported in the literature regarding the 
application of modern program analysis techniques to hardware. 

Don Gaubatz [4] proposes a 4-value ‘quaternary’ logic that bears some 
resemblance to our 5-value transitional logic. 

Paul Cunningham [3] extends Gaubatz’s work in many respects, though his 
formalism is based on a conventional 2-value logic with transitions handled ex- 
plicitly as events rather than as values in an extended logic. 

Charles Hymans [9] presents a safety property checking technique based 
upon abstract interpretation of (synchronous) behavioural VHDL 
specifications. 
9 Conclusions 

In this paper, we have presented a technique based upon the solid foundation 
of abstract interpretation [1, 2] that allows properties of a wide class of digital 
circuits to be reasoned about. We describe what is essentially a first attempt at 
applying abstract interpretation to asynchronous hardware - clearly more can 
be done, particularly in exploring completeness. 

9.1 Future Work 

In Section 7, we define refinement and equivalence relations on circuits. It ap- 
pears to be possible to generalise this definition of refinement and equivalence to 
any abstract domain that is itself amenable to abstract interpretation. We have 
already demonstrated that our technique is potentially useful for logic simulation 
[13] - implementing a demonstrable simulator is a logical next step. 

An experimental proof system exists for the 11-value clean/static transitional 
logic, and we hope to extend this to cover the more general case, p(T). 
Abstract. Several authors have advocated the use of the gated data 
dependence graph as a compiler intermediate representation. If this rep- 
resentation is to gain acceptance, it is important to show that we may 
construct static analyses which operate directly on it. In this paper 
we present the first example of such an analysis, developed using the 
methodology of abstract interpretation. The analysis is shown to be 
sound with respect to a concrete semantics for the representation. Exper- 
imental results are presented which indicate that the analysis performs 
well in comparison to conventional techniques. 



1 Introduction 

The classical intermediate representation for optimizing compilation is the Con- 
trol Flow Graph. Static analysis techniques may be applied to a CFG to deduce 
properties which will hold at runtime, and this information may then be used 
to direct program transformations. The methodology of abstract interpretation 
[4] is often used to construct an analysis, or to demonstrate that an existing 
analysis is sound. 

The cost of maintaining auxiliary data structures associated with the CFG, 
and the increasing need to extract parallelism from sequential code, have led 
several authors to propose alternative program representations, such as the 
Program Dependence Graph [8] and gated Data Dependence Graph [3]. If these 
representations are to be used in practical compilers, it is important to 
demonstrate that they are amenable to static analysis. 

In this paper we describe the design of a static analysis for gated DDGs, using 
the methodology of abstract interpretation. A concrete semantics is presented, 
and the soundness of the analysis with respect to this semantics is shown. The 
analysis may be instantiated with any numerical domain which satisfies certain 
straightforward conditions, permitting a tradeoff between cost and the accuracy 
of the results. 

* This research was supported in part by the Réseau National de recherche et 
d’innovation en Technologies Logicielles (RNTL). 
The next section describes gated DDGs, and presents an informal execution 
model. Section 3 presents a rule-based concrete semantics and equivalent fixpoint 
(collecting) semantics. Section 4 presents our abstract semantics, which is shown 
to be sound. Section 5 discusses the performance of our static analysis and section 
6 gives conclusions and describes our ongoing work. 



2 Gated Data Dependence Graphs 

Gated DDGs were introduced in [3] as part of the Program Dependence Web 
intermediate representation. They may be generated from imperative programs 
using one of several techniques, including SESE-analysis and symbolic execution 
[10], calculation of path expressions [9] and syntax directed translation [6]. 

Nodes represent operations and edges represent data dependencies (i.e. they 
run from value consumers to value producers). Edges connect to nodes via ports. 



Unconditional Execution. Every gated DDG has a unique in node, which 
provides an output port for each formal parameter, and a unique out node which 
provides an input port for each return value. Arithmetic nodes have either one 
or two input ports, and one output port. Figure 1 shows the graph for a simple 
function with no conditional statements. The pale grey edges represent state 
dependencies, described below. 



int func(int a, int b) 
{ 
    return (a + b) * (a - b); 
} 










Fig. 1. Gated DDG for function with no conditional statements 



Evaluation of a function involves initializing the outputs of the in node with 
the values of the parameters, and evaluating the inputs of the out node. 



Conditional Execution. Conditional constructs are implemented using $\gamma$ 
nodes, which have three input ports. One of two value inputs is assigned to 
the single output port depending on whether a predicate input is zero. Figure 2 
shows the graph for the function max, which returns the larger of two values. 
int max(int a, int b) 
{ 
    if (a > b) 
        return a; 
    else 
        return b; 
} 




Fig. 2. Gated DDG for function with conditional statement 



Iteration. We use the GSA form of gated DDG [3], which represents iteration 
using polyadic $\mu$ loop header and $\eta$ loop exit nodes. Our $\mu$ nodes have one initial 
input, one next input and one output for each loop variant. Our $\eta$ nodes have 
a predicate input, and one input and one output for each loop result. Figure 3 
shows the graph for an iterative factorial function. 



int fact(int a) 
{ 
    int b := 1; 
    do { 
        b := b * a; 
        a := a - 1; 
    } while (a); 
    return b; 
} 




Fig. 3. Gated DDG for function with loop statement 



Evaluation of an $\eta$ node involves first initializing the outputs of the 
corresponding $\mu$ node with values from the initial inputs and then repeatedly 
reinitializing with values from the next inputs until the predicate is equal to zero. 
The inputs of the $\eta$ node may then be evaluated and assigned to their respective 
outputs. 

Other forms of gated DDG differ in the notation used to represent iteration. 
The Value State Dependence Graph [6] uses $\theta$ loop nodes, and the Value 
Dependence Graph [10] models iteration as tail recursion. Both forms share the 
same execution model as GSA form however, and we anticipate that the results 
presented here may be readily extended to them. 
State. We follow the convention of representing memory as a mapping from 
locations to values. Every in node provides an output port for the initial state, 
and every out node provides an input port for the final state. Update operations 
consume a state object and produce a new state object. State dependence edges 
(shown in light grey in the examples above), ensure necessary serialization of 
access to memory. 

3 Concrete Semantics 

In the subsequent discussion, we will restrict ourselves to functions which have 
a single argument, and in which only a single value varies in any given loop. 
This greatly simplifies our notation, as the result of evaluating a node is a single 
value (rather than a tuple). The reintroduction of polyadic in, $\mu$ and $\eta$ nodes is 
straightforward. 

Figure 4 presents the rule-based concrete semantics of gated DDGs. Judgements 
take the form $\rho \vdash_V x \Downarrow v_x$, indicating that node x has value $v_x$ in 
environment $\rho$, and $\rho \vdash_E (p, \mu_x(i,n)) \Downarrow \rho'$, indicating that $\rho'$ is the first environment 
reachable from $\rho$ by repeated evaluation of a loop body which makes p false. An 
environment is a mapping from nodes to values for the in node and the $\mu$ nodes 
of all enclosing loops. The judgements $\rho \vdash_V x \Downarrow false$ and $\rho \vdash_V x \Downarrow true$ indicate 
that node x has zero and non-zero value respectively. 



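\[ \frac{}{\rho \vdash_V \mathit{const}(v) \Downarrow v}\;\textsc{const} \]

\[ \frac{\rho \vdash_V x \Downarrow v_x \quad v = \mathrm{unop}(v_x)}{\rho \vdash_V \mathit{unop}(x) \Downarrow v}\;\textsc{unary} \]

\[ \frac{\rho \vdash_V x \Downarrow v_x \quad \rho \vdash_V y \Downarrow v_y \quad v = \mathrm{binop}(v_x, v_y)}{\rho \vdash_V \mathit{binop}(x,y) \Downarrow v}\;\textsc{binary} \]

\[ \frac{\rho \vdash_V p \Downarrow true \quad \rho \vdash_V t \Downarrow v}{\rho \vdash_V \gamma(p,t,f) \Downarrow v}\;\textsc{true} \qquad \frac{\rho \vdash_V p \Downarrow false \quad \rho \vdash_V f \Downarrow v}{\rho \vdash_V \gamma(p,t,f) \Downarrow v}\;\textsc{false} \]

\[ \frac{}{\rho \vdash_V \mathit{in} \Downarrow \rho(\mathit{in})}\;\textsc{lookup-in} \qquad \frac{}{\rho \vdash_V \mu_x(i,n) \Downarrow \rho(\mu_x)}\;\textsc{lookup-}\mu \]

\[ \frac{\rho \vdash_V i \Downarrow v_i \quad \rho[\mu_x \mapsto v_i] \vdash_E (p, \mu_x(i,n)) \Downarrow \rho' \quad \rho' \vdash_V r \Downarrow v_r}{\rho \vdash_V \eta(p, r, \mu_x(i,n)) \Downarrow v_r}\;\textsc{init} \]

\[ \frac{\rho \vdash_V p \Downarrow false}{\rho \vdash_E (p, \mu_x(i,n)) \Downarrow \rho}\;\textsc{end} \]

\[ \frac{\rho \vdash_V p \Downarrow true \quad \rho \vdash_V n \Downarrow v_n \quad \rho[\mu_x \mapsto v_n] \vdash_E (p, \mu_x(i,n)) \Downarrow \rho'}{\rho \vdash_E (p, \mu_x(i,n)) \Downarrow \rho'}\;\textsc{next} \]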



Fig. 4. Rule-based semantics 



The first three rules correspond to the standard semantics of arithmetic 
expressions. The rules true and false give the meaning of a $\gamma$ node for non-zero 
and zero predicate values respectively. The rules lookup-in and lookup-$\mu$ 
retrieve the value for a particular node from the current environment. 

The rule init gives the meaning of an $\eta$ node. The environment is first 
extended to map the corresponding $\mu$ node to the value of its initial input. The 
rules next and end then generate successive environments until one is found in 
which the predicate is false. Finally, the value of the result input is computed in 
this environment. 
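The execution model described by these rules can be sketched as a small recursive evaluator. The sketch below is illustrative only and follows the single-argument, single-loop-variable restriction of this section; the class names (`In`, `Const`, `BinOp`, `Gamma`, `Mu`, `Eta`) and the choice to store a loop's initial and next inputs on the `Eta` node are assumptions made to keep the example acyclic, not the paper's data structures.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class In:
    pass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: Callable[[int, int], int]
    x: object
    y: object


@dataclass(frozen=True)
class Gamma:
    p: object
    t: object
    f: object


@dataclass(frozen=True)
class Mu:
    name: str


@dataclass(frozen=True)
class Eta:
    p: object      # loop predicate
    r: object      # result input
    mu: Mu         # the loop header node
    init: object   # initial input of the header
    next: object   # next input of the header


def eval_node(node, env):
    """Evaluate a node in an environment mapping 'in' and loop headers to values."""
    if isinstance(node, In):
        return env["in"]                              # lookup-in
    if isinstance(node, Const):
        return node.value                             # const
    if isinstance(node, BinOp):
        return node.op(eval_node(node.x, env), eval_node(node.y, env))
    if isinstance(node, Gamma):                       # true / false
        branch = node.t if eval_node(node.p, env) != 0 else node.f
        return eval_node(branch, env)
    if isinstance(node, Mu):
        return env[node.name]                         # lookup-mu
    if isinstance(node, Eta):
        loop_env = {**env, node.mu.name: eval_node(node.init, env)}   # init
        while eval_node(node.p, loop_env) != 0:                       # next
            loop_env = {**loop_env, node.mu.name: eval_node(node.next, loop_env)}
        return eval_node(node.r, loop_env)                            # end, then result
    raise TypeError(f"unknown node: {node!r}")


gt = lambda a, b: int(a > b)
sub = lambda a, b: a - b

x = Mu("x")
# Repeatedly subtract 3 from the argument while the loop variable exceeds 2.
loop = Eta(p=BinOp(gt, x, Const(2)), r=x, mu=x, init=In(), next=BinOp(sub, x, Const(3)))
# The larger of the argument and 5, using a gamma node.
larger = Gamma(BinOp(gt, In(), Const(5)), In(), Const(5))

print(eval_node(loop, {"in": 10}))
print(eval_node(larger, {"in": 3}))
```

Evaluation is demand-driven: only the branch selected by a `Gamma` predicate is evaluated, and the loop body is re-evaluated in each successive environment, mirroring the rules above.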

As a basis for proving that an abstract interpretation is sound, it is helpful 
to recast the rule-based semantics of figure 4 in fixpoint form. First, observe that 
substituting the rules in figure 5 for the rules init, next and end does not alter 
the meaning of an $\eta$ node. 



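\[ \frac{\rho \vdash_V i \Downarrow v_i \quad \rho[\mu_x \mapsto v_i] \vdash_E (p, \mu_x(i,n)) \Downarrow \rho' \quad \rho' \vdash_V p \Downarrow false \quad \rho' \vdash_V r \Downarrow v_r}{\rho \vdash_V \eta(p, r, \mu_x(i,n)) \Downarrow v_r}\;\textsc{init}' \]

\[ \frac{}{\rho \vdash_E (p, \mu_x(i,n)) \Downarrow \rho}\;\textsc{end}' \]

\[ \frac{\rho \vdash_E (p, \mu_x(i,n)) \Downarrow \rho' \quad \rho' \vdash_V p \Downarrow true \quad \rho' \vdash_V n \Downarrow v_n}{\rho \vdash_E (p, \mu_x(i,n)) \Downarrow \rho'[\mu_x \mapsto v_n]}\;\textsc{next}' \]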



Fig. 5. Alternative rule-based semantics for iteration 



In this case, the rules next' and end' generate all environments reachable 
from the initial environment by a sequence of transitions in which every inter- 
mediate environment makes the predicate true. The rule init' selects the one 
environment for which the predicate is false. 

By defining $[\![n]\!]\rho = \{v \mid \rho \vdash_V n \Downarrow v\}$ and expressing the set of environments 
generated by next' and end' as the result of a fixpoint operator, we obtain the 
semantics shown in figure 6. 

We have $[\![n]\!] : \mathcal{E} \to \wp(\mathcal{V})$ and $(\!|p, \mu_x(i,n)|\!) : \mathcal{E} \to \wp(\mathcal{E})$, where $\mathcal{E}$ is the set of 
possible environments, and $\mathcal{V}$ is the set of possible values. Note that as gated 
DDGs are Turing powerful, the fixpoint is not effectively computable. 
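For a loop that does terminate, the collecting semantics can be computed by plain Kleene iteration. The Python sketch below is a minimal illustration under strong assumptions: an environment is represented only by the value of the single loop variable, and the predicate and next input are passed in as ordinary Python functions (`pred`, `nxt`), which are hypothetical stand-ins for evaluating the corresponding nodes.

```python
def lfp(F):
    """Least fixpoint of F by iteration from the empty set (may not terminate in general)."""
    X = frozenset()
    while True:
        nxt_X = F(X)
        if nxt_X == X:
            return X
        X = nxt_X


def loop_envs(rho, pred, nxt):
    """Environments reachable from rho while the predicate holds."""
    def F(X):
        # F(X) = {rho} U { rho'[mu_x -> v_n] | rho' in X, pred true in rho' }
        return frozenset({rho}) | frozenset(nxt(r) for r in X if pred(r))
    return lfp(F)


pred = lambda x: x > 2      # loop continues while x > 2
nxt = lambda x: x - 3       # next value of the loop variable

reachable = loop_envs(10, pred, nxt)
exits = {r for r in reachable if not pred(r)}   # environments selected by the loop exit

print(sorted(reachable))
print(exits)
```

The same iteration diverges for a non-terminating loop over an unbounded value set, which is the practical face of the non-computability remark above.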

4 Abstract Interpretation 

The goal of static analysis is to determine whether a particular assertion holds 
for all possible executions of a program. In the context of gated DDGs these 
assertions refer to the values returned by nodes. An analyzer must therefore be 
able to compute a conservative estimate of the set of all possible return values. 

As the fixpoint in figure 6 is not effectively computable, we adopt the method- 
ology of abstract interpretation to obtain a sound, decidable approximation of 
the concrete semantics. 

4.1 Abstract Domains 

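Abstract domains $\mathcal{D}_{\mathcal{E}}$ and $\mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ must be chosen to represent elements of $\wp(\mathcal{E})$ 
and $\wp(\mathcal{E} \times \mathcal{V})$. Each domain is a complete partial order $(\mathcal{D}, \sqsubseteq, \bot, \sqcup)$. The 
bottom $\bot$ provides an abstraction of the empty set, and the join $\sqcup$ computes an 
upper-bound of two elements. The meanings of elements are given by monotonic 
concretization functions $\gamma_{\mathcal{E}} : \mathcal{D}_{\mathcal{E}} \to \wp(\mathcal{E})$ and $\gamma_{\mathcal{E}\times\mathcal{V}} : \mathcal{D}_{\mathcal{E}\times\mathcal{V}} \to \wp(\mathcal{E} \times \mathcal{V})$. 

\[ [\![\mathit{const}(v)]\!]\rho = \{v\} \]
\[ [\![\mathit{unop}(x)]\!]\rho = \{\mathrm{unop}(v_x) \mid v_x \in [\![x]\!]\rho\} \]
\[ [\![\mathit{binop}(x,y)]\!]\rho = \{\mathrm{binop}(v_x, v_y) \mid v_x \in [\![x]\!]\rho \land v_y \in [\![y]\!]\rho\} \]
\[ [\![\mathit{in}]\!]\rho = \{\rho(\mathit{in})\} \]
\[ [\![\mu_x(i,n)]\!]\rho = \{\rho(\mu_x)\} \]
\[ [\![\gamma(p,t,f)]\!]\rho = \{v \mid true \in [\![p]\!]\rho \land v \in [\![t]\!]\rho\} \cup \{v \mid false \in [\![p]\!]\rho \land v \in [\![f]\!]\rho\} \]
\[ [\![\eta(p,r,\mu_x(i,n))]\!]\rho = \{v_r \mid v_i \in [\![i]\!]\rho \land \rho' \in (\!|p, \mu_x(i,n)|\!)\,\rho[\mu_x \mapsto v_i] \land false \in [\![p]\!]\rho' \land v_r \in [\![r]\!]\rho'\} \]
\[ (\!|p, \mu_x(i,n)|\!)\rho = \mathrm{lfp}_{\emptyset} F \]

where 

\[ F(X) = \{\rho\} \cup \{\rho'[\mu_x \mapsto v_n] \mid \rho' \in X \land true \in [\![p]\!]\rho' \land v_n \in [\![n]\!]\rho'\} \]

Fig. 6. Fixpoint semantics 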

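We require a number of primitives. lookup : $\mathcal{D}_{\mathcal{E}} \to \mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ reads information 
about a loop variable, release : $\mathcal{D}_{\mathcal{E}\times\mathcal{V}} \to \mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ removes information about a 
loop variable, assign : $\mathcal{D}_{\mathcal{E}\times\mathcal{V}} \to \mathcal{D}_{\mathcal{E}}$ writes information about a loop variable, and 
select : $\mathcal{D}_{\mathcal{E}} \to \mathcal{D}_{\mathcal{E}}$ asserts a boolean condition. These must obey the following 
soundness conditions: 

\[ \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(X) \land v = \rho(x)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{lookup}_x(X)) \]
\[ \{(\rho', v) \mid (\rho, v) \in \gamma_{\mathcal{E}\times\mathcal{V}}(X) \land \forall y \neq x : \rho'(y) = \rho(y)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{release}_x(X)) \]
\[ \{\rho[x \mapsto v] \mid (\rho, v) \in \gamma_{\mathcal{E}\times\mathcal{V}}(X)\} \subseteq \gamma_{\mathcal{E}}(\mathrm{assign}_x(X)) \]
\[ \{\rho \mid \rho \in \gamma_{\mathcal{E}}(X) \land v \in [\![p]\!]\rho\} \subseteq \gamma_{\mathcal{E}}(\mathrm{select}_{p=v}(X)) \]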



The concretization of the result of lookup must contain each environment in that 
of its operand, paired with the retrieved value. The concretization of the result 
of release must contain each environment-value pair in that of its operand, pre- 
serving information about all but one value. The concretization of the result of 
assign must contain each environment in that of its operand, updated appropri- 
ately. The concretization of the result of select must contain each environment 
in that of its operand for which the condition holds. 
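To make the roles of the four primitives concrete, here is a minimal sketch of one possible instantiation over integer intervals. It is not the domain used in the paper's implementation: the dictionary representation, the `None` encoding of bottom, and the specialised `select_gt` (which handles only predicates of the form `x > c`) are all assumptions made for illustration.

```python
import math

TOP = (-math.inf, math.inf)   # interval carrying no information


def lookup(x, env):
    """D_E -> D_ExV: pair the abstract environment with the interval of x."""
    return None if env is None else (env, env[x])


def release(x, pair):
    """D_ExV -> D_ExV: forget everything known about x, keep the value."""
    if pair is None:
        return None
    env, val = pair
    return ({**env, x: TOP}, val)


def assign(x, pair):
    """D_ExV -> D_E: write the value component into x."""
    if pair is None:
        return None
    env, val = pair
    return {**env, x: val}


def select_gt(x, c, truth, env):
    """D_E -> D_E: assert that the predicate (x > c) has the given truth value."""
    if env is None:
        return None
    lo, hi = env[x]
    if truth:
        lo = max(lo, c + 1)
    else:
        hi = min(hi, c)
    return None if lo > hi else {**env, x: (lo, hi)}


env = {"x": (0, 10)}
print(select_gt("x", 2, True, env))
print(select_gt("x", 2, False, env))
print(select_gt("x", 20, True, env))
print(assign("x", (env, (5, 5))))
```

Each function over-approximates the corresponding concrete set, which is all the soundness conditions ask for; a relational domain such as convex polyhedra would additionally let `lookup` record that the value equals the variable.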

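We also require a primitive constant : $\mathcal{D}_{\mathcal{E}} \to \mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ which builds the 
representation of a constant, and abstract equivalents of the concrete arithmetic 
operators, $\mathrm{unop}^\sharp : \mathcal{D}_{\mathcal{E}\times\mathcal{V}} \to \mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ and $\mathrm{binop}^\sharp : \mathcal{D}_{\mathcal{E}\times\mathcal{V}} \times \mathcal{D}_{\mathcal{E}\times\mathcal{V}} \to \mathcal{D}_{\mathcal{E}\times\mathcal{V}}$. These 
must obey the following soundness conditions: 

\[ \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(X)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{constant}_v(X)) \]
\[ \{(\rho, \mathrm{unop}(x)) \mid (\rho, x) \in \gamma_{\mathcal{E}\times\mathcal{V}}(X)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{unop}^\sharp(X)) \]
\[ \{(\rho, \mathrm{binop}(x, y)) \mid (\rho, x) \in \gamma_{\mathcal{E}\times\mathcal{V}}(X) \land (\rho, y) \in \gamma_{\mathcal{E}\times\mathcal{V}}(Y)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{binop}^\sharp(X, Y)) \]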






The concretization of the result of constant must contain each environment in 
that of its operand, paired with the constant value. The concretizations of the 
results of $\mathrm{unop}^\sharp$ and $\mathrm{binop}^\sharp$ must contain the results of performing the 
corresponding concrete operations on those of their operands. 

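If $\mathcal{D}_{\mathcal{E}}$ possesses infinitely increasing chains, we also require a widening 
operator $\nabla : \mathcal{D}_{\mathcal{E}} \times \mathcal{D}_{\mathcal{E}} \to \mathcal{D}_{\mathcal{E}}$, which must obey the usual soundness conditions. 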



4.2 Abstract Semantics 

We define the abstract semantics in terms of the primitive operations on the 
abstract domains, as shown in figure 7. 



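\[ [\![\mathit{const}(v)]\!]^\sharp P = \mathrm{constant}_v(P) \]
\[ [\![\mathit{unop}(x)]\!]^\sharp P = \mathrm{unop}^\sharp([\![x]\!]^\sharp P) \]
\[ [\![\mathit{binop}(x,y)]\!]^\sharp P = \mathrm{binop}^\sharp([\![x]\!]^\sharp P, [\![y]\!]^\sharp P) \]
\[ [\![\mathit{in}]\!]^\sharp P = \mathrm{lookup}_{\mathit{in}}(P) \]
\[ [\![\mu_x(i,n)]\!]^\sharp P = \mathrm{lookup}_{\mu_x}(P) \]
\[ [\![\gamma(p,t,f)]\!]^\sharp P = [\![t]\!]^\sharp \mathrm{select}_{p=true}(P) \;\sqcup_{\mathcal{E}\times\mathcal{V}}\; [\![f]\!]^\sharp \mathrm{select}_{p=false}(P) \]
\[ [\![\eta(p,r,\mu_x(i,n))]\!]^\sharp P = \mathrm{release}_{\mu_x}([\![r]\!]^\sharp \mathrm{select}_{p=false}((\!|p,\mu_x(i,n)|\!)^\sharp \mathrm{assign}_{\mu_x}([\![i]\!]^\sharp P))) \]
\[ (\!|p,\mu_x(i,n)|\!)^\sharp P = \lim\nolimits_\nabla F^\sharp \]

where 

\[ F^\sharp(X) = P \sqcup_{\mathcal{E}} \mathrm{assign}_{\mu_x}([\![n]\!]^\sharp \mathrm{select}_{p=true}(X)) \]

and $\lim_\nabla F^\sharp$ denotes the limit of the abstract iteration sequence with widening 

\[ F^\sharp_0 = \bot^\sharp \]
\[ F^\sharp_{i+1} = F^\sharp_i \,\nabla\, F^\sharp(F^\sharp_i) \]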

Fig. 7. Abstract semantics 

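We have $[\![n]\!]^\sharp : \mathcal{D}_{\mathcal{E}} \to \mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ and $(\!|p, \mu_x(i,n)|\!)^\sharp : \mathcal{D}_{\mathcal{E}} \to \mathcal{D}_{\mathcal{E}}$. If $\mathcal{D}_{\mathcal{E}}$ does not 
possess infinitely increasing chains, we may omit the widening operator $\nabla$. 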
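The abstract iteration sequence with widening can be illustrated on a one-variable interval domain. The sketch below is an assumption-laden toy, not the paper's implementation: it uses the standard interval widening, and the transformer built by `make_transformer` hard-codes a hypothetical loop "while x < bound: x := x + 1" in place of the general select/evaluate/assign pipeline.

```python
import math

BOT = None   # abstraction of the empty set


def join(a, b):
    """Interval hull of two abstract values."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def widen(a, b):
    """Standard interval widening: unstable bounds jump to infinity."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    lo = a[0] if a[0] <= b[0] else -math.inf
    hi = a[1] if a[1] >= b[1] else math.inf
    return (lo, hi)


def lim_widen(F):
    """Iterate X := X widen F(X) from bottom until the sequence is stationary."""
    X = BOT
    while True:
        nxt = widen(X, F(X))
        if nxt == X:
            return X
        X = nxt


def make_transformer(P, bound):
    """F#(X) = P join assign(next(select(X))) for the loop 'while x < bound: x := x + 1'."""
    def F(X):
        if X is BOT:
            return P
        lo, hi = X[0], min(X[1], bound - 1)     # select: x < bound holds
        if lo > hi:
            return P                            # predicate never true
        return join(P, (lo + 1, hi + 1))        # next input, then assign
    return F


print(lim_widen(make_transformer((0, 0), 100)))
```

On this example the iteration stabilises after two widening steps, at the cost of losing the upper bound; intersecting the result with the negated predicate afterwards (the select applied for the loop exit) would recover a lower bound of 100 for the exit value.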

The rules for const, unop and binop nodes are straightforward. The rule 
for $\gamma$ nodes uses select to create a pair of abstract environments in which the 
predicate is true and false, evaluates the nodes t and f in these environments, 
and combines the results using the join operator. 

The rule for $\eta$ nodes first uses assign to create the initial abstract 
environment. An approximation of the set of reachable environments is then computed 
as the least fixpoint of a semantic transformer, which uses select to create an 
abstract environment in which the loop predicate is true and assign to update 
it. select is used to create an abstract environment in which the loop predicate 
is false and r is evaluated in this environment. Finally release is used to remove 
unnecessary information about the loop variable. 

4.3 Soundness 

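The concretization function $\gamma : (\mathcal{D}_{\mathcal{E}} \to \mathcal{D}_{\mathcal{E}\times\mathcal{V}}) \to (\mathcal{E} \to \wp(\mathcal{V}))$, which gives the 
concrete meaning of the abstract semantics, may be defined in terms of $\gamma_{\mathcal{E}}$ and 
$\gamma_{\mathcal{E}\times\mathcal{V}}$ as follows: 

\[ \gamma([\![n]\!]^\sharp) = \lambda\rho.\{v \mid \forall P : \rho \in \gamma_{\mathcal{E}}(P) \Rightarrow (\rho, v) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![n]\!]^\sharp P)\} . \]

The result of $\gamma$ is a function which maps an environment $\rho$ to a set of values. 
A value v is in the set if applying the abstract semantics to any element of $\mathcal{D}_{\mathcal{E}}$ 
whose concretization contains $\rho$ yields an element of $\mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ whose concretization 
contains $(\rho, v)$. 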

Our abstract semantics is sound, that is for all environments, the set of 
values computed by the concretization of the abstract semantics contains the set 
of values computed by the concrete semantics. 

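Theorem 1 (Soundness). For any node n, we have: 

\[ \forall \rho : [\![n]\!](\rho) \subseteq \gamma([\![n]\!]^\sharp)(\rho) . \tag{1} \]

Proof. Using the definition of $\gamma$, we have that relation 1 is equivalent to 

\[ \forall \rho : \forall v : v \in [\![n]\!](\rho) \Rightarrow (\forall P : \rho \in \gamma_{\mathcal{E}}(P) \Rightarrow (\rho, v) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![n]\!]^\sharp P)) \]

which is equivalent to 

\[ \forall P : \forall \rho : \forall v : v \in [\![n]\!](\rho) \land \rho \in \gamma_{\mathcal{E}}(P) \Rightarrow (\rho, v) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![n]\!]^\sharp P) \]

and to 

\[ \forall P : \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(P) \land v \in [\![n]\!](\rho)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}([\![n]\!]^\sharp P) . \tag{2} \]

We prove 2 by induction on the depth of the graph. 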

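- n = const(c). $\{(\rho, c) \mid \rho \in \gamma_{\mathcal{E}}(P)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{constant}_c(P))$ follows immediately 
from the soundness condition on constant. 

- n = unop(x). This case is similar to binop(x,y), below. 

- n = binop(x,y). By the induction hypothesis, we have 

\[ v_x \in [\![x]\!]\rho \land \rho \in \gamma_{\mathcal{E}}(P) \Rightarrow (\rho, v_x) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![x]\!]^\sharp P) \]
\[ v_y \in [\![y]\!]\rho \land \rho \in \gamma_{\mathcal{E}}(P) \Rightarrow (\rho, v_y) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![y]\!]^\sharp P) \]

and therefore 

\[ \{(\rho, \mathrm{binop}(v_x, v_y)) \mid \rho \in \gamma_{\mathcal{E}}(P) \land v_x \in [\![x]\!]\rho \land v_y \in [\![y]\!]\rho\} \]
\[ \subseteq \{(\rho, \mathrm{binop}(v_x, v_y)) \mid (\rho, v_x) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![x]\!]^\sharp P) \land (\rho, v_y) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![y]\!]^\sharp P)\} . \]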
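By the soundness condition on $\mathrm{binop}^\sharp$, this is 

\[ \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{binop}^\sharp([\![x]\!]^\sharp P, [\![y]\!]^\sharp P)) = \gamma_{\mathcal{E}\times\mathcal{V}}([\![\mathit{binop}(x,y)]\!]^\sharp P) . \]

- n = in. This case is identical to $\mu_x(i,n)$, below. 

- n = $\mu_x(i,n)$. $\{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(P) \land v = \rho(\mu_x)\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{lookup}_{\mu_x}(P))$ follows 
immediately from the soundness condition on lookup. 

- n = $\gamma(p, t, f)$. By the soundness condition on select, we have 

\[ \rho \in \gamma_{\mathcal{E}}(P) \land false \in [\![p]\!]\rho \Rightarrow \rho \in \gamma_{\mathcal{E}}(\mathrm{select}_{p=false}(P)) \]

and therefore 

\[ \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(P) \land false \in [\![p]\!]\rho \land v \in [\![f]\!]\rho\} \subseteq \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(\mathrm{select}_{p=false}(P)) \land v \in [\![f]\!]\rho\} \]

which by the induction hypothesis on f is 

\[ \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}([\![f]\!]^\sharp \mathrm{select}_{p=false}(P)) . \]

By similar reasoning, we also have 

\[ \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(P) \land true \in [\![p]\!]\rho \land v \in [\![t]\!]\rho\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}([\![t]\!]^\sharp \mathrm{select}_{p=true}(P)) \]

and therefore 

\[ \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(P) \land v \in [\![\gamma(p,t,f)]\!]\rho\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}([\![t]\!]^\sharp \mathrm{select}_{p=true}(P)) \cup \gamma_{\mathcal{E}\times\mathcal{V}}([\![f]\!]^\sharp \mathrm{select}_{p=false}(P)) . \]

As $\sqcup_{\mathcal{E}\times\mathcal{V}}$ is a sound approximation of the $\cup$ operator, this is 

\[ \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}([\![t]\!]^\sharp \mathrm{select}_{p=true}(P) \sqcup_{\mathcal{E}\times\mathcal{V}} [\![f]\!]^\sharp \mathrm{select}_{p=false}(P)) = \gamma_{\mathcal{E}\times\mathcal{V}}([\![\gamma(p,t,f)]\!]^\sharp P) . \]

- n = $\eta(p, r, \mu_x(i,n))$. This case is proved later by proposition 1. 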

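To prove the soundness of the rule for $\eta$ nodes, we need to establish the 
following subsidiary lemmas. 

Lemma 1. Given concrete and abstract domains $(D, \sqsubseteq, \bot, \sqcup)$ and 
$(D^\sharp, \sqsubseteq^\sharp, \bot^\sharp, \sqcup^\sharp)$, concretization function $\gamma : D^\sharp \to D$, concrete and abstract 
semantic transformers $F : D \to D$ and $F^\sharp : D^\sharp \to D^\sharp$, and widening operator 
$\nabla : D^\sharp \times D^\sharp \to D^\sharp$ such that: 

1. $\gamma$ monotonic 

2. $F$ continuous 

3. $\forall X \in D^\sharp : F(\gamma(X)) \sqsubseteq \gamma(F^\sharp(X))$ 

4. $\forall X, Y \in D^\sharp : X \sqsubseteq^\sharp X \nabla Y,\; Y \sqsubseteq^\sharp X \nabla Y$ 

5. For every sequence $(x_n)_{n \in \mathbb{N}}$ of elements of $D^\sharp$, the sequence $(y_n)_{n \in \mathbb{N}}$ defined 
as 

\[ y_0 = x_0 \]
\[ y_{n+1} = y_n \nabla x_{n+1} \]

is ultimately stationary, 

then there exists some N after which the abstract iteration sequence with widening 

\[ F^\sharp_0 = \bot^\sharp \]
\[ F^\sharp_{i+1} = F^\sharp_i \,\nabla\, F^\sharp(F^\sharp_i) \]

converges, and $\mathrm{lfp}_\bot F \sqsubseteq \gamma(F^\sharp_N)$. 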

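Proof. We prove by induction on n that $\forall n \in \mathbb{N} : F_n \sqsubseteq \gamma(F^\sharp_n)$, where $F_n$ is 
defined as: 

\[ F_0 = \bot \]
\[ F_{n+1} = F(F_n) . \]

- $F_0 = \bot \sqsubseteq \gamma(F^\sharp_0) = \gamma(\bot^\sharp)$ 

- Assuming $F_n \sqsubseteq \gamma(F^\sharp_n)$, since F is continuous we have $F_{n+1} = F(F_n) \sqsubseteq F(\gamma(F^\sharp_n))$. 
From condition 3 we have $F(\gamma(F^\sharp_n)) \sqsubseteq \gamma(F^\sharp(F^\sharp_n))$ and from 
condition 4 we have $F^\sharp(F^\sharp_n) \sqsubseteq^\sharp F^\sharp_n \nabla F^\sharp(F^\sharp_n)$. Since $\gamma$ is monotone, we have 
$\gamma(F^\sharp(F^\sharp_n)) \sqsubseteq \gamma(F^\sharp_n \nabla F^\sharp(F^\sharp_n))$. We have thus established that $F_{n+1} \sqsubseteq \gamma(F^\sharp_{n+1})$. 

The sequence $(F^\sharp_n)_{n \in \mathbb{N}}$ is increasing (condition 4), and converges (condition 5). 
There must therefore exist an N such that $\forall n \ge 0 : F^\sharp_n \sqsubseteq^\sharp F^\sharp_N$. From the 
above we have $\forall n \ge 0 : F_n \sqsubseteq \gamma(F^\sharp_N)$, and hence $\bigsqcup_{n \ge 0} F_n \sqsubseteq \gamma(F^\sharp_N)$. Since F is 
continuous, by Kleene’s theorem [7], $\bigsqcup_{n \ge 0} F_n$ is exactly $\mathrm{lfp}_\bot F$. 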

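Lemma 2. Assuming that $[\![n]\!]^\sharp$ is correct, that is 

\[ \forall P : \forall \rho : \forall v : \rho \in \gamma_{\mathcal{E}}(P) \land v \in [\![n]\!]\rho \Rightarrow (\rho, v) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![n]\!]^\sharp P) \]

and given the concrete and abstract semantic transformers 

\[ F(X) = \{\rho\} \cup \{\rho'[\mu_x \mapsto v_n] \mid \rho' \in X \land true \in [\![p]\!]\rho' \land v_n \in [\![n]\!]\rho'\} \]
\[ F^\sharp(X) = P \sqcup_{\mathcal{E}} \mathrm{assign}_{\mu_x}([\![n]\!]^\sharp \mathrm{select}_{p=true}(X)) \]

and given that $\rho \in \gamma_{\mathcal{E}}(P)$ then the following inclusion holds: 

\[ F(\gamma_{\mathcal{E}}(X)) \subseteq \gamma_{\mathcal{E}}(F^\sharp(X)) . \]

Proof. 

\[ F(\gamma_{\mathcal{E}}(X)) = \{\rho\} \cup \{\rho'[\mu_x \mapsto v_n] \mid \rho' \in \gamma_{\mathcal{E}}(X) \land true \in [\![p]\!]\rho' \land v_n \in [\![n]\!]\rho'\} . \]

We know that $\rho \in \gamma_{\mathcal{E}}(P)$ by hypothesis. By the soundness condition on select 
we have 

\[ \{\rho'[\mu_x \mapsto v_n] \mid \rho' \in \gamma_{\mathcal{E}}(X) \land true \in [\![p]\!]\rho' \land v_n \in [\![n]\!]\rho'\} \]
\[ \subseteq \{\rho'[\mu_x \mapsto v_n] \mid \rho' \in \gamma_{\mathcal{E}}(\mathrm{select}_{p=true}(X)) \land v_n \in [\![n]\!]\rho'\} . \]

By hypothesis on $[\![n]\!]^\sharp$ this is 

\[ \subseteq \{\rho'[\mu_x \mapsto v_n] \mid (\rho', v_n) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![n]\!]^\sharp \mathrm{select}_{p=true}(X))\} \]

and by the soundness condition on assign this is 

\[ \subseteq \gamma_{\mathcal{E}}(\mathrm{assign}_{\mu_x}([\![n]\!]^\sharp \mathrm{select}_{p=true}(X))) . \]

Together with $\rho \in \gamma_{\mathcal{E}}(P)$, and since $\sqcup_{\mathcal{E}}$ computes an upper bound of its operands and $\gamma_{\mathcal{E}}$ is monotonic, 
both parts of $F(\gamma_{\mathcal{E}}(X))$ are contained in $\gamma_{\mathcal{E}}(F^\sharp(X))$. 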

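Lemma 3. Assuming that $[\![n]\!]^\sharp$ is correct, that is 

\[ \forall P : \forall \rho : \forall v : \rho \in \gamma_{\mathcal{E}}(P) \land v \in [\![n]\!]\rho \Rightarrow (\rho, v) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![n]\!]^\sharp P) \]

and given that $\rho \in \gamma_{\mathcal{E}}(P)$ then the following inclusion holds: 

\[ (\!|p, \mu_x(i,n)|\!)\rho \subseteq \gamma_{\mathcal{E}}((\!|p, \mu_x(i,n)|\!)^\sharp P) . \]

Proof. We apply lemma 1 for 

\[ F(X) = \{\rho\} \cup \{\rho'[\mu_x \mapsto v_n] \mid \rho' \in X \land true \in [\![p]\!]\rho' \land v_n \in [\![n]\!]\rho'\} \]
\[ F^\sharp(X) = P \sqcup_{\mathcal{E}} \mathrm{assign}_{\mu_x}([\![n]\!]^\sharp \mathrm{select}_{p=true}(X)) . \]

Monotonicity of $\gamma_{\mathcal{E}}$ and continuity of F are easily checked. Since $\rho' \in 
\gamma_{\mathcal{E}}(\mathrm{assign}_{\mu_x}([\![i]\!]^\sharp P))$ and by induction hypothesis on $[\![n]\!]^\sharp$ we know by lemma 
2 that 

\[ F(\gamma_{\mathcal{E}}(X)) \subseteq \gamma_{\mathcal{E}}(F^\sharp(X)) \]

so 

\[ (\!|p, \mu_x(i,n)|\!)\rho = \mathrm{lfp}_{\emptyset} F \subseteq \gamma_{\mathcal{E}}(\lim\nolimits_\nabla F^\sharp) = \gamma_{\mathcal{E}}((\!|p, \mu_x(i,n)|\!)^\sharp P) . \]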

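We may now prove the soundness of the rule for $\eta$ nodes. 

Proposition 1. The rule for $\eta$ nodes is sound, i.e. 

\[ \forall P : \{(\rho, v) \mid \rho \in \gamma_{\mathcal{E}}(P) \land v \in [\![\eta(p, r, \mu_x(i,n))]\!]\rho\} \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}([\![\eta(p, r, \mu_x(i,n))]\!]^\sharp(P)) . \]

Proof. Using the definition of $[\![\eta(p, r, \mu_x(i,n))]\!]$, the left hand side is 

\[ = \{(\rho, v_r) \mid \rho \in \gamma_{\mathcal{E}}(P) \land v_i \in [\![i]\!]\rho \land \rho'' = \rho[\mu_x \mapsto v_i] \land \rho' \in (\!|p, \mu_x(i,n)|\!)\rho'' \land false \in [\![p]\!]\rho' \land v_r \in [\![r]\!]\rho'\} . \]

It is straightforward to verify by induction that $\forall \mu_y \neq \mu_x : \rho'(\mu_y) = \rho(\mu_y)$, as 
$\rho''$ is $\rho$ with $\mu_x$ modified, and $(\!|p, \mu_x(i,n)|\!)\rho''$ only modifies $\mu_x$. Therefore this is 

\[ = \{(\rho, v_r) \mid \rho \in \gamma_{\mathcal{E}}(P) \land v_i \in [\![i]\!]\rho \land \rho'' = \rho[\mu_x \mapsto v_i] \land \rho' \in (\!|p, \mu_x(i,n)|\!)\rho'' \land false \in [\![p]\!]\rho' \land v_r \in [\![r]\!]\rho' \land \forall \mu_y \neq \mu_x : \rho'(\mu_y) = \rho(\mu_y)\} . \]

By the induction hypothesis on i this is 

\[ \subseteq \{(\rho, v_r) \mid (\rho, v_i) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![i]\!]^\sharp P) \land \rho'' = \rho[\mu_x \mapsto v_i] \land \rho' \in (\!|p, \mu_x(i,n)|\!)\rho'' \land false \in [\![p]\!]\rho' \land v_r \in [\![r]\!]\rho' \land \forall \mu_y \neq \mu_x : \rho'(\mu_y) = \rho(\mu_y)\} \]

which by the soundness condition on assign is 

\[ \subseteq \{(\rho, v_r) \mid \rho'' \in \gamma_{\mathcal{E}}(\mathrm{assign}_{\mu_x}([\![i]\!]^\sharp P)) \land \rho' \in (\!|p, \mu_x(i,n)|\!)\rho'' \land false \in [\![p]\!]\rho' \land v_r \in [\![r]\!]\rho' \land \forall \mu_y \neq \mu_x : \rho'(\mu_y) = \rho(\mu_y)\} . \]

By lemma 3 (correctness of $(\!|p, \mu_x(i,n)|\!)^\sharp$), this is 

\[ \subseteq \{(\rho, v_r) \mid \rho' \in \gamma_{\mathcal{E}}((\!|p, \mu_x(i,n)|\!)^\sharp \mathrm{assign}_{\mu_x}([\![i]\!]^\sharp P)) \land false \in [\![p]\!]\rho' \land v_r \in [\![r]\!]\rho' \land \forall \mu_y \neq \mu_x : \rho'(\mu_y) = \rho(\mu_y)\} \]

which by the soundness condition on select is 

\[ \subseteq \{(\rho, v_r) \mid \rho' \in \gamma_{\mathcal{E}}(\mathrm{select}_{p=false}((\!|p, \mu_x(i,n)|\!)^\sharp \mathrm{assign}_{\mu_x}([\![i]\!]^\sharp P))) \land v_r \in [\![r]\!]\rho' \land \forall \mu_y \neq \mu_x : \rho'(\mu_y) = \rho(\mu_y)\} . \]

By the induction hypothesis on r, this is 

\[ \subseteq \{(\rho, v_r) \mid (\rho', v_r) \in \gamma_{\mathcal{E}\times\mathcal{V}}([\![r]\!]^\sharp \mathrm{select}_{p=false}((\!|p, \mu_x(i,n)|\!)^\sharp \mathrm{assign}_{\mu_x}([\![i]\!]^\sharp P))) \land \forall \mu_y \neq \mu_x : \rho'(\mu_y) = \rho(\mu_y)\} . \]

Finally, by the soundness condition on release, this is 

\[ \subseteq \gamma_{\mathcal{E}\times\mathcal{V}}(\mathrm{release}_{\mu_x}([\![r]\!]^\sharp \mathrm{select}_{p=false}((\!|p, \mu_x(i,n)|\!)^\sharp \mathrm{assign}_{\mu_x}([\![i]\!]^\sharp P)))) . \]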



5 Results 

We have developed an implementation of our analysis which uses the domains 
of convex polyhedra of dimension N and N + 1 (denoted by $P_N$ and $P_{N+1}$) as 
$\mathcal{D}_{\mathcal{E}}$ and $\mathcal{D}_{\mathcal{E}\times\mathcal{V}}$ respectively. Empty polyhedra correspond to $\bot$, and the convex 
hull operation corresponds to $\sqcup$. As $P_N$ possesses infinitely increasing chains, we 
require a widening operator. 

We use the Parma Polyhedra Library [1] to provide basic operations on poly- 
hedra. To improve the accuracy of our results, we use the improved widening 
operator described in [2] and widening with tokens (a form of delayed widening). 
The primitives lookup, release, assign and select are defined as follows. 

$\mathrm{lookup}_x(P)$ Add a dimension to P to produce an element of $P_{N+1}$. Constrain 
the new dimension to equal the dimension which corresponds to x. 

$\mathrm{release}_x(P)$ Discard the dimension which corresponds to x, and add a new 
unconstrained dimension. 

$\mathrm{assign}_x(P)$ Exchange the (N + 1)th dimension with the dimension which 
corresponds to x, and discard the new (N + 1)th dimension. 
void bubblesort(int *k, int n) 
{ 
    int b := n; 
    while (b > 1) { 
        int j := 1, t := 0; 
        while (j ≤ (b - 1)) { 
            if (k[j] > k[j + 1]) { 
                EXCHANGE(k, j, j + 1); 
                t := j; 
            } 
            j := j + 1; 
        } 
        b := t; 
    } 
} 

Fig. 8. Bubblesort routine 

void f(int i) 
{ 
    int j := i * i; 
    if (i = 1) { 
        int k := j; 
        // use k 
    } 
} 

Fig. 9. Routine benefitting from gated-DDG analysis 


$\mathrm{select}_{p=v}(P)$ Examine the predicate p = v to determine the additional restraints 
which it implies, expressed as a polyhedron $Q \in P_N$. Form the intersection 
of P and Q. 

It is straightforward to demonstrate that these primitives satisfy their re- 
spective soundness conditions. 

The primitives constant, $\mathrm{unop}^\sharp$ and $\mathrm{binop}^\sharp$ use the PPL affine transformations 
for negation, addition, subtraction and multiplication by a constant, and return 
an unknown value for other operations. 

We have applied our implementation to the graphs for a variety of programs, 
and have found that the results compare well with those obtained by direct 
abstract interpretation of the programs themselves. As an example, consider the 
bubblesort routine in figure 8, which is taken from [5]. 

Our implementation was able to deduce exactly the same set of restraints 
listed by the original paper. In particular, it found that the following set of 
restraints holds at the start of the body of the inner loop. 

\[ b \le n, \quad t \ge 0, \quad t + 1 \le j, \quad j + 1 \le b \]

The main advantage of our analysis over CFG-based analysis lies in our 
ability to exploit information at each use of a value, rather than at its definition. 
Consider the routine in figure 9. 

An analysis of the CFG using convex polyhedra to represent abstract 
environments would be unable to deduce restraints for j, and hence for k. Our 
analysis can make use of information from the predicate i = 1 to deduce that 
k = 1. It is possible to obtain this effect in a CFG-based analysis by inlining the 
definition of each variable into each of its uses, but this can easily lead to an 
exponential increase in code size. 
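The effect of evaluating a definition at its use, under the constraint established there, can be mimicked with a toy interval evaluation. This sketch deliberately swaps convex polyhedra for intervals, and the bounded range chosen for the unconstrained argument is a hypothetical stand-in; it only illustrates why re-evaluating `i * i` where `i = 1` is known yields a precise value.

```python
def mul(a, b):
    """Interval product: the hull of the four corner products."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))


def eval_j(i_interval):
    """Abstract value of j := i * i given an interval for i."""
    return mul(i_interval, i_interval)


unknown_i = (-1000, 1000)        # hypothetical bounds for an otherwise unknown argument

at_definition = eval_j(unknown_i)    # j evaluated where nothing is known about i
at_use = eval_j((1, 1))              # j re-evaluated under the predicate i = 1

print(at_definition)
print(at_use)
```

In the demand-driven scheme, k is computed by evaluating its producer in the environment refined by the predicate, so the second call corresponds to what the analysis sees inside the conditional.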
6 Conclusions and Further Work 

We have presented a static analysis, developed using the methodology of abstract 
interpretation, which operates directly on gated DDGs. This is an important step 
in demonstrating their usefulness as a compiler intermediate representation. In 
terms of precision, our analysis compares very favourably with traditional CFG- 
based techniques. 

We intend to extend our implementation to represent the relationship be- 
tween values on consecutive loop iterations. This straightforward modification, 
which will involve allocating two dimensions to each value instead of one, will 
provide information useful for reasoning about loop termination. 

Our eventual goal is to produce an analyzer which can form part of a DDG- 
based optimizing compiler. To accomplish this, we will need to improve the 
efficiency of our implementation. At present, we compute the return value of a 
node each time it is needed, which can incur a substantial performance penalty. 
By introducing a caching scheme and reusing previously computed values, we 
will be able to eliminate this problem without sacrificing accuracy. 
Abstract. We describe a polynomial-time algorithm for global value 
numbering, which is the problem of discovering equivalences among pro- 
gram sub-expressions. We treat all conditionals as non-deterministic and 
all program operators as uninterpreted. We show that there are pro- 
grams for which the set of all equivalences contains terms whose value 
graph representation requires exponential size. Our algorithm discovers 
all equivalences among terms of size at most s in time that grows linearly 
with s. For global value numbering, it suffices to choose s to be the size 
of the program. Earlier deterministic algorithms for the same problem 
are either incomplete or take exponential time. 



1 Introduction 

Detecting equivalence of program sub-expressions has a variety of applications. 
Compilers use this information to perform several important optimizations like 
constant and copy propagation [13], common sub-expression elimination, in- 
variant code motion [2,11], induction variable elimination, branch elimination, 
branch fusion, and loop jamming [8]. Program verification tools use these equiv- 
alences to discover loop invariants, and to verify program assertions. This in- 
formation is also important for discovering equivalent computations in different 
programs; this is useful for plagiarism detection tools and translation validation 
tools [10,9], which compare a program with an optimized version in order to 
check the correctness of the optimizer. 

Ghecking equivalence of program expressions is an undecidable problem, even 
when all conditionals are treated as non-deterministic. Most tools, including 
compilers, attempt to only discover equivalences between expressions that are 
computed using the same operator applied to equivalent operands. This form 
of equivalence, where the operators are treated as uninterpreted functions, is 
also called Herbrand equivalence [12]. The process of discovering this restricted 
class of equivalences is often referred to as value numbering. Performing value 
numbering in basic blocks is an easy problem; the challenge is in doing it globally 
for a procedure body. 

Existing deterministic algorithms for global value numbering are either too 
expensive or imprecise. The precise algorithms are based on an early algorithm 
by Kildall [7], which discovers equivalences by performing an abstract 
interpretation [3] over the lattice of Herbrand equivalences. Kildall’s algorithm 
discovers all Herbrand equivalences in a function body but has exponential 
cost [12]. At the other extreme, there are several polynomial-time algorithms 
that are complete for basic blocks, but are imprecise in the presence of joins 
and loops in a program. The popular partition refinement algorithm proposed by 
Alpern, Wegman, and Zadeck (AWZ) [1] is particularly efficient, but at the price 
of being significantly less precise than Kildall’s algorithm. The novel idea in 
the AWZ algorithm is to represent the values of variables after a join using a 
fresh selection function $\phi_i$, similar to the functions used in the static 
single assignment form [4], and to treat the $\phi_i$ functions as additional 
uninterpreted functions. The AWZ algorithm is incomplete because it treats 
$\phi$ functions as uninterpreted. In an attempt to remedy this problem, Rüthing, 
Knoop and Steffen have proposed a polynomial-time algorithm (RKS) [12] that 
alternately applies the AWZ algorithm and some rewrite rules for normalization 
of terms involving $\phi$ functions, until the congruence classes reach a fixed 
point. Their algorithm discovers more equivalences than the AWZ algorithm, but 
remains incomplete. The AWZ and the RKS algorithms both use a data structure 
called value graph [8], which encodes the abstract syntax of program 
sub-expressions, and represents equivalences by merging nodes that have been 
discovered to be referring to equivalent expressions. We discuss these 
algorithms in more detail in Section 5. Recently, Gargi has proposed a set 
of balanced algorithms that are efficient, but also incomplete [5]. 

Our algorithm is based on two novel observations. First, it is important to 
make a distinction between “discovering all Herbrand equivalences” vs. “dis- 
covering Herbrand equivalences among program sub-expressions”. The former 
involves discovering Herbrand equivalences among all terms that can be con- 
structed using program variables and uninterpreted functions in the program. 
The latter refers to only those terms that occur syntactically in the program. 
Finding all Herbrand equivalences is attractive not only because it answers 
questions about non-program terms, but also because it allows a forward dataflow 
or abstract interpretation based algorithm (e.g. Kildall’s algorithm) to discover 
all equivalences among program terms. This is because discovery of an equivalence between 
program terms at some program point may require detecting equivalences among 
non-program terms at a preceding program point. This distinction is important 
because we show (in Section 4) that there is a family of acyclic programs for 
which the set of all Herbrand equivalences requires an exponential sized (in the 
size of the program) value graph representation. On the other hand, we also 
show that Herbrand equivalences among program sub-expressions can always be 
represented using a linear sized value graph. This implies that no algorithm that 
uses value graphs to represent equivalences can discover all Herbrand equiva- 
lences and have polynomial-time complexity at the same time. This observation 
explains why existing polynomial-time algorithms for value numbering are in- 
complete, even for acyclic programs. One of the reasons why Kildall’s algorithm 
is exponential is that it discovers all Herbrand equivalences at each program 
point. 

The above observation not only sheds light on the incompleteness or 
exponential complexity of the existing algorithms, but also motivates the design 
of our algorithm. Our algorithm takes a parameter s and discovers all Herbrand 
equivalences among terms of size at most s in time that grows linearly with s. For 
the purpose of global value numbering, it is sufficient to set the parameter s to 
N, where N is the size of the program, since the size of any program expression 
is at most N. 

The second observation is that the lattice of sets of Herbrand equivalences 
has finite height k, where k is the number of program variables (we prove this in 
Section 3.4). Therefore, an optimistic-style algorithm that performs an abstract 
interpretation over the lattice of Herbrand equivalences will be able to handle 
cyclic programs as precisely as it can handle acyclic programs, and will terminate 
in at most k iterations. Without this observation, one can ensure the termination 
of the algorithm in presence of loops by adding a degree of pessimism. This leads 
to incompleteness in presence of loops, as is the case with the RKS algorithm [12]. 
Instead, our algorithm is based on abstract interpretation, similar to Kildall’s 
algorithm, while using a more sophisticated join operation. We continue with 
a description of the expression language on which the algorithm operates (in 
Section 2), followed by a description of the algorithm itself in Section 3. 

2 Language of Program Expressions 

We consider a language in which the expressions occurring in assignments belong 
to the following simple language of uninterpreted function terms (here x is one 
of the variables, and c is one of the constants): 

$$e ::= x \mid c \mid F(e_1, e_2)$$

For any expression e, we use the notation Variables(e) to denote the set of 
variables that occur in expression e. We use size(e) to denote the number of 
occurrences of function symbols in expression e (when expressed as a value graph). 
For simplicity, we consider only one binary uninterpreted function F. Our results 
can be extended easily to languages with any finite number of uninterpreted 
functions of any constant arity. Alternatively, one can encode any finite number 
of uninterpreted functions of constant arity by one binary function symbol with 
only a constant factor increase in the size of the program. 
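
As a minimal sketch of this expression language (not part of the formal development), the grammar and the two measures Variables(e) and size(e) can be written in Python as follows. The class names Var, Const and F, and the function names variables and size, are illustrative assumptions; size counts each distinct F sub-expression once, which is how we read "when expressed as a value graph".

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str          # a program variable x


@dataclass(frozen=True)
class Const:
    value: int         # a constant c


@dataclass(frozen=True)
class F:
    left: "Expr"       # first operand e1
    right: "Expr"      # second operand e2


Expr = Union[Var, Const, F]


def variables(e: Expr) -> set:
    """Variables(e): the set of variable names occurring in e."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, Const):
        return set()
    return variables(e.left) | variables(e.right)


def size(e: Expr) -> int:
    """size(e): number of F nodes when e is stored as a value graph,
    so a shared sub-expression is counted only once."""
    seen = set()

    def walk(t: Expr) -> None:
        if isinstance(t, F) and t not in seen:
            seen.add(t)
            walk(t.left)
            walk(t.right)

    walk(e)
    return len(seen)
```

For instance, with t = F(Var("x"), Var("y")), the term F(t, t) has size 2 under this reading, because the shared operand t contributes a single node.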

3 The Global Value Numbering Algorithm 

Our algorithm discovers the set of Herbrand equivalences at any program point 
by performing an abstract interpretation over the lattice of Herbrand equiva- 
lences. We pointed out in the introduction, and we argue further in Section 4, 
that we cannot hope to have a complete and polynomial-time algorithm that 
discovers all Herbrand equivalences implied by a program (using the standard 
value graph based representations) because their representation is worst-case ex- 
ponential in the size of the program. Thus, our algorithm takes a parameter s 
(which is a positive integer) and discovers all equivalences of the form 
$e_1 = e_2$, where $size(e_1) \leq s$ and $size(e_2) \leq s$. The algorithm uses a 
data structure called Strong Equivalence DAG (described in Section 3.1) to 
represent the set of equivalences at any program point. It updates the data 
structure across each flowchart node as shown in Figure 1. The Assignment and 
Join functions are described in Section 3.2 and Section 3.3 respectively. 

[Figure 1: (a) Assignment node x := e, with SED G before the node and 
G' = Assignment(G, x := e) after it; (b) Conditional node, with SED G before the 
node and G1 = G, G2 = G on the two outgoing branches; (c) Join node, with SEDs 
G1 and G2 on the incoming edges and G = Join(G1, G2, s') after it.] 

Fig. 1. Flowchart nodes 

3.1 Notation and Data Structure 

Let T be the set of all program variables, k the total number of program variables, 
and N the size of the program, measured in terms of the number of occurrences 
of function symbol F in the program. 

The algorithm represents the set of equivalences at any program point by a 
data structure that we call Strong Equivalence DAG (SED). An SED is similar 
to a value graph. It is a labeled directed acyclic graph whose nodes can be 
represented by tuples $\langle V, t \rangle$, where $V$ is a (possibly empty) set 
of program variables labeling the node, and $t$ represents the type of the node. 
The type $t$ is either $\bot$ or $c$, indicating that the node has no successors, 
or $F(n_1, n_2)$, indicating that the node has two ordered successors $n_1$ and 
$n_2$. 

In any SED $G$, for every variable $x$, there is exactly one node 
$\langle V, t \rangle$, denoted by $Node_G(x)$, such that $x \in V$. For every 
type $t$ that is not $\bot$, there is at most one node with that type. We use the 
notation $Node_G(c)$ to refer to the node with type $c$. For any SED node $n$, we 
use the notation $Vars(n)$ to denote the set of variables labeling node $n$, and 
$Type(n)$ to denote the type of node $n$. Every node $n$ in an SED represents the 
following set of terms $Terms(n)$, which are all known to be equivalent. 

$$Terms(\langle V, \bot \rangle) = V$$
$$Terms(\langle V, c \rangle) = V \cup \{c\}$$
$$Terms(\langle V, F(n_1, n_2) \rangle) = V \cup \{F(e_1, e_2) \mid e_1 \in Terms(n_1),\ e_2 \in Terms(n_2)\}$$

We use the notation $G \models e_1 = e_2$ to denote that $G$ implies the 
equivalence $e_1 = e_2$. The judgment $G \models e_1 = e_2$ is deduced as follows. 

$$G \models F(e_1, e_2) = F(e_1', e_2') \quad \text{iff} \quad G \models e_1 = e_1' \text{ and } G \models e_2 = e_2'$$
$$G \models x = e \quad \text{iff} \quad e \in Terms(Node_G(x))$$

The algorithm starts with the following initial SED at the program start, 
which implies only trivial equivalences. 

$$G_0 = \{\langle \{x\}, \bot \rangle \mid x \in T\}$$

[Figure 2: a program and the SEDs computed at its program points, including 
$G_3 = Join(G_1, G_2, 5)$ and $G_4 = Assignment(G_3, u := F(F(x, y), x))$.] 

Fig. 2. This figure shows a program and the execution of our algorithm on it. 
$G_i$, shown in a dotted box, represents the SED at program point $L_i$ 

In figures showing SEDs, we omit the set delimiters “{” and “}”, and represent 
a node $\langle \{x_1, .., x_n\}, t \rangle$ as $\langle x_1, .., x_n, t \rangle$. 
Figure 2 shows a program and the SEDs computed by our algorithm at various 
points. As an example, note that 
$Terms(Node_{G_4}(u)) = \{u\} \cup \{F(z, \alpha) \mid \alpha \in \{x, y\}\} \cup \{F(F(\alpha_1, \alpha_2), \alpha_3) \mid \alpha_1, \alpha_2, \alpha_3 \in \{x, y\}\}$. 
Hence, $G_4 \models u = F(z, x)$. Note that an SED compactly represents a 
possibly exponential number of equivalent terms. 
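
The following Python sketch illustrates how the judgment G |= e1 = e2 can be decided without enumerating Terms(n). It is only an illustration under our own encoding assumptions: an SED is a dictionary from hypothetical integer node ids to pairs (variables, type), a type is None (for bottom), ("c", value) or ("F", id1, id2), and a term is a variable name, an integer constant, or a tuple ("F", e1, e2). The helper names find and entails are ours. The dictionary G4 encodes the equivalences y = x, z = F(x, x), u = F(F(x, x), x) stated for G4 in Section 3.4.

```python
BOT = None  # the type "bottom": a leaf about which nothing is known


def find(G, e):
    """Return the id of the node n with e in Terms(n), or None if there is none."""
    if isinstance(e, str):                        # a variable
        return next((n for n, (vs, _) in G.items() if e in vs), None)
    if isinstance(e, int):                        # a constant
        return next((n for n, (_, t) in G.items() if t == ("c", e)), None)
    _, e1, e2 = e                                 # a term F(e1, e2)
    n1, n2 = find(G, e1), find(G, e2)
    if n1 is None or n2 is None:
        return None
    return next((n for n, (_, t) in G.items() if t == ("F", n1, n2)), None)


def entails(G, e1, e2):
    """Decide G |= e1 = e2 using the two deduction rules for the judgment."""
    n1, n2 = find(G, e1), find(G, e2)
    if n1 is not None and n2 is not None:
        return n1 == n2                           # both terms live in some node
    if isinstance(e1, tuple) and isinstance(e2, tuple):
        # F(e1, e2) = F(e1', e2') iff the operands are pairwise equivalent
        return entails(G, e1[1], e2[1]) and entails(G, e1[2], e2[2])
    return e1 == e2                               # only syntactic identity remains


G4 = {
    0: (frozenset({"x", "y"}), BOT),
    1: (frozenset({"z"}), ("F", 0, 0)),
    2: (frozenset({"u"}), ("F", 1, 0)),
}
```

With this encoding, entails(G4, "u", ("F", "z", "x")) holds, matching the example above, while entails(G4, "z", "u") does not.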

3.2 The Assignment Operation 

Let G be an SED that represents the Herbrand equivalences before an assignment 
node x := e. The SED that represents the Herbrand equivalences after the 
assignment node can be obtained by using the following algorithm. SED G4 in 
Figure 2 shows an example of the Assignment operation. 

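1 Assignment($G$, $x := e$) = 
2   $G' := G$; 
3   let $\langle V_1, t_1 \rangle$ = GetNode($G'$, $e$) in 
4   let $\langle V_2, t_2 \rangle$ = $Node_{G'}(x)$ in 
5   if $\langle V_1, t_1 \rangle \neq \langle V_2, t_2 \rangle$ then $G' := G' - \{\langle V_1, t_1 \rangle, \langle V_2, t_2 \rangle\}$; 
6       $G' := G' \cup \{\langle V_1 \cup \{x\}, t_1 \rangle, \langle V_2 - \{x\}, t_2 \rangle\}$; 
7   return $G'$; 

1 GetNode($G'$, $e$) = 
2   match $e$ with 
3     $y$: return $Node_{G'}(y)$; 
4     $c$: return $Node_{G'}(c)$; 
5     $F(e_1, e_2)$: let $n_1$ = GetNode($G'$, $e_1$) and $n_2$ = GetNode($G'$, $e_2$) in 
6       if $\langle V, F(n_1, n_2) \rangle \in G'$ for some $V$, 
          then return $\langle V, F(n_1, n_2) \rangle$; 
7       else $G' := G' \cup \{\langle \emptyset, F(n_1, n_2) \rangle\}$; return $\langle \emptyset, F(n_1, n_2) \rangle$; 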








A Polynomial-Time Algorithm for Global Value Numbering 217 



GetNode($G'$, $e$) returns a node $n$ such that $e \in Terms(n)$ (and in the 
process possibly extends $G'$) in $O(size(e))$ time. Lines 5 and 6 of the 
Assignment function move variable $x$ to node $n$ to reflect the new equivalence 
$x = e$. Hence, the following lemma holds. 

Lemma 1 (Soundness and Completeness of Assignment Operation). 
Let $G' = Assignment(G, x := e)$. Let $e_1$ and $e_2$ be two expressions. Let 
$e_1' = e_1[e/x]$ and $e_2' = e_2[e/x]$. Then, $G' \models e_1 = e_2$ iff 
$G \models e_1' = e_2'$. 
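
The Assignment operation can be sketched in Python as follows. This is an illustration under assumed encodings rather than a reference implementation: an SED is a dictionary from hypothetical integer node ids to pairs (variables, type), with types None (bottom), ("c", value) or ("F", id1, id2). Two choices are our assumptions and are not fixed by the pseudocode: a constant node is created on demand if it does not exist yet, and nodes are updated by id so that edges from other nodes stay valid.

```python
BOT = None  # the type "bottom"


def get_node(G, e):
    """Return the id of a node whose Terms contain e, extending G if needed."""
    if isinstance(e, str):                     # variable: its unique node
        return next(n for n, (vs, _) in G.items() if e in vs)
    if isinstance(e, int):                     # constant c
        t = ("c", e)
    else:                                      # F(e1, e2): resolve operands first
        _, e1, e2 = e
        t = ("F", get_node(G, e1), get_node(G, e2))
    for n, (_, tn) in G.items():               # reuse the node of this type, if any
        if tn == t:
            return n
    n = max(G) + 1                             # otherwise add an unlabeled node
    G[n] = (frozenset(), t)
    return n


def assignment(G, x, e):
    """Return the SED after x := e; the input SED is left unchanged."""
    G = dict(G)                                # G' := G
    n1 = get_node(G, e)                        # node representing e
    n2 = get_node(G, x)                        # node currently labeled with x
    if n1 != n2:                               # move the label x from n2 to n1
        v2, t2 = G[n2]
        G[n2] = (v2 - {x}, t2)
        v1, t1 = G[n1]
        G[n1] = (v1 | {x}, t1)
    return G
```

Starting from the initial SED over x, y, z, the call assignment(G0, "z", ("F", "x", "y")) adds a node of type F labeled with z. A later assignment y := x moves y into the node of x but leaves the operands of that F node untouched, so z remains equal to F(x, old y) and not to F(x, x).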

3.3 The Join Operation 

Let $G_1$ and $G_2$ be two SEDs. Let $s'$ be any positive integer. The following 
function Join returns an SED $G$ that represents all equivalences $e_1 = e_2$ such 
that both $G_1$ and $G_2$ imply $e_1 = e_2$ and both $size(e_1)$ and $size(e_2)$ 
are at most $s'$. In order to discover all equivalences among expressions of size 
at most $s$ in the program, we need to choose $s' = s + N \times k$ (for reasons 
explained later in Section 3.5). Figure 2 shows an example of the Join operation. 

For any SED $G$, let $\prec_G$ denote a partial order on program variables such 
that $x \prec_G y$ if $y$ depends on $x$, or more precisely, if 
$G \models y = F(e_1, e_2)$ such that $x \in Variables(F(e_1, e_2))$. 

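1 Join($G_1$, $G_2$, $s'$) = 
2   for all nodes $n_1 \in G_1$ and $n_2 \in G_2$, memoize[$n_1$, $n_2$] := undefined; 
3   $G := \emptyset$; 
4   for each variable $x \in T$ in the order $\prec_{G_1}$ do 
5     counter := $s'$; 
6     Intersect($Node_{G_1}(x)$, $Node_{G_2}(x)$); 
7   return $G$; 

1 Intersect($\langle V_1, t_1 \rangle$, $\langle V_2, t_2 \rangle$) = 
2   let $m$ = memoize[$\langle V_1, t_1 \rangle$, $\langle V_2, t_2 \rangle$] in 
3   if $m \neq$ undefined then return $m$; 
4   let $t$ = if counter > 0 and $t_1 = F(\ell_1, r_1)$ and $t_2 = F(\ell_2, r_2)$ then 
5       counter := counter - 1; 
6       let $\ell$ = Intersect($\ell_1$, $\ell_2$) in 
7       let $r$ = Intersect($r_1$, $r_2$) in 
8       if ($\ell \neq \langle \emptyset, \bot \rangle$) and ($r \neq \langle \emptyset, \bot \rangle$) then $F(\ell, r)$ else $\bot$ 
9     else if $t_1 = c$ and $t_2 = c$ for some $c$, then $c$ 
10    else $\bot$ in 
11  let $V = V_1 \cap V_2$ in 
12  if $V \neq \emptyset$ or $t \neq \bot$ then $G := G \cup \{\langle V, t \rangle\}$; 
13  memoize[$\langle V_1, t_1 \rangle$, $\langle V_2, t_2 \rangle$] := $\langle V, t \rangle$; 
14  return $\langle V, t \rangle$ 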

It is important for correctness of the algorithm that calls to the Intersect 
function are memoized, as done explicitly in the above pseudocode, since 
otherwise the counter variable will be decremented incorrectly. The use of the 
counter variable ensures that the call to the Intersect function in Join 
terminates in $O(s')$ time. The following proposition describes the property of 
the Intersect function that is required to prove the correctness of the Join 
function (Lemma 2). 
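
The Join and Intersect functions can be sketched in Python as shown below. The encoding is an assumption made for illustration: SEDs are dictionaries from hypothetical node ids to pairs (variables, type), a node of the result is named by the pair of ids it was computed from, and None plays the role of both the type bottom and the empty node that is not added to G. The bottom-up order is approximated here by sorting variables by the height of their node in G1, which is one order compatible with the dependency order described above.

```python
BOT = None  # the type "bottom"; also returned for the empty node (no labels, bottom)


def join(G1, G2, s_prime, variables):
    memo, G = {}, {}
    counter = [0]                                   # mutable cell shared by the recursion

    def node(H, x):
        return next(n for n, (vs, _) in H.items() if x in vs)

    def height(n):                                  # leaves have height 0
        t = G1[n][1]
        return 1 + max(height(t[1]), height(t[2])) if t and t[0] == "F" else 0

    def intersect(n1, n2):
        if (n1, n2) in memo:                        # memoized: counter is not touched
            return memo[(n1, n2)]
        (v1, t1), (v2, t2) = G1[n1], G2[n2]
        t = BOT
        if counter[0] > 0 and t1 and t2 and t1[0] == "F" and t2[0] == "F":
            counter[0] -= 1
            left = intersect(t1[1], t2[1])
            right = intersect(t1[2], t2[2])
            if left is not None and right is not None:
                t = ("F", left, right)
        elif t1 and t1[0] == "c" and t1 == t2:      # the same constant on both sides
            t = t1
        v = v1 & v2
        result = None
        if v or t is not BOT:                       # keep only non-empty nodes
            result = (n1, n2)
            G[result] = (v, t)
        memo[(n1, n2)] = result
        return result

    for x in sorted(variables, key=lambda x: height(node(G1, x))):
        counter[0] = s_prime
        intersect(node(G1, x), node(G2, x))
    return G


# One branch executes a := x; b := F(a, a), the other a := y; b := F(a, a).
G1 = {0: (frozenset({"x", "a"}), BOT), 1: (frozenset({"b"}), ("F", 0, 0)),
      2: (frozenset({"y"}), BOT)}
G2 = {0: (frozenset({"x"}), BOT), 1: (frozenset({"y", "a"}), BOT),
      2: (frozenset({"b"}), ("F", 1, 1))}
G = join(G1, G2, 5, ["x", "y", "a", "b"])
```

In this hypothetical example the joined SED keeps b = F(a, a), which holds on both branches, and drops a = x and a = y, which each hold on only one. Calling join with s_prime = 0 instead leaves the node of b with type bottom, which shows how the counter bounds the size of the terms that are discovered.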

Proposition 1. Let $n_1 = \langle V_1, t_1 \rangle$ and 
$n_2 = \langle V_2, t_2 \rangle$ be any nodes in SEDs $G_1$ and $G_2$ 
respectively. Let $n = \langle V, t \rangle = Intersect(n_1, n_2)$. Suppose that 
$n \neq \langle \emptyset, \bot \rangle$; hence the function 
Intersect($n_1$, $n_2$) adds the node $n$ to $G$. Let $\alpha$ be the value of the 
counter variable when Intersect($n_1$, $n_2$) is first called. Then, 

P1. $Terms(n) \subseteq Terms(n_1) \cap Terms(n_2)$. 

P2. $Terms(n) \supseteq \{e \mid e \in Terms(n_1),\ e \in Terms(n_2),\ size(e) \leq \alpha\}$. 

The proof of Proposition 1 is by induction on the sum of the heights of nodes 
$n_1$ and $n_2$ in $G_1$ and $G_2$ respectively. Claim P1 is easy since 
$t = F(\ldots)$ or $c$ only if both $t_1$ and $t_2$ are $F(\ldots)$ or $c$ 
respectively (Lines 8 and 9), and $V = V_1 \cap V_2$ (Line 11). The proof of claim 
P2 relies on bottom-up processing of one of the SEDs, and memoization. Let $e'$ be 
one of the smallest expressions (in terms of size) such that 
$e' \in Terms(n_1) \cap Terms(n_2)$. If $e'$ is not a variable, then for any 
variable $y \in Variables(e')$, the call 
Intersect($Node_{G_1}(y)$, $Node_{G_2}(y)$) has already finished. The crucial 
observation now is that if $size(e') \leq \alpha$, then the set of recursive calls 
to Intersect are in 1-1 correspondence with the nodes of expression $e'$, and 
$e' \in Terms(n)$. 

Lemma 2 (Soundness and Completeness of Join Operation). Let 
$G = Join(G_1, G_2, s)$. If $G \models e_1 = e_2$, then $G_1 \models e_1 = e_2$ 
and $G_2 \models e_1 = e_2$. If $G_1 \models e_1 = e_2$ and 
$G_2 \models e_1 = e_2$ such that $size(e_1) \leq s$ and $size(e_2) \leq s$, then 
$G \models e_1 = e_2$. 

The proof of Lemma 2 follows from Proposition 1 and the definition of $\models$. 

3.4 Fixed Point Computation 

The algorithm goes around loops in a program until a fixed point is reached. The 
following theorem implies that the algorithm needs to execute each flowchart 
node at most k times (assuming the standard worklist implementation [8]). 

Theorem 1 (Fixed Point Theorem). The lattice of sets of Herbrand equiv- 
alences (involving program variables) ordered by set inclusion has height at most 
k where k is the number of program variables. 

The proof of Theorem 1 follows easily from Lemma 3 stated and proved 
below. Before stating Lemma 3, we first introduce some notation. Let $\preceq$ 
denote any total ordering on all program variables. For notational convenience, 
we say that for any variable $x$, and any expressions $e_1$ and $e_2$, 
$x \succeq F(e_1, e_2)$. For any SED $G$, let $I_G$ be the set of variables $x$ 
such that $Type(Node_G(x)) = \bot$, and $x \preceq y$ for all 
$y \in Vars(Node_G(x))$. $I_G$ is a maximal set of independent variables, which 
occur at the leaves of $G$. In other words, equivalences denoted by an SED $G$ can 
be represented by a set of equivalences $x = e$, where 
$Variables(e) \subseteq I_G$ and $x \notin I_G$. This is because for any SED $G$, 
all equivalences $e_1 = e_2$ are consequences of equivalences of the form 
$x = e$. For example, consider the program in Figure 2. If 
$u \preceq x \preceq y \preceq z$, then $I_{G_4} = \{x\}$. Note that equivalences 
represented by $G_4$ are equivalent to the set of equivalences 
$\{y = x,\ z = F(x, x),\ u = F(F(x, x), x)\}$. 

Lemma 3. Let $G_1$ and $G_2$ be two SEDs. If $G_2$ is above $G_1$ in the lattice 
(which is to say that $G_1$ represents a stronger set of equivalences than 
$G_2$), then $I_{G_2} \supset I_{G_1}$. 

Proof. We first make two useful observations. Let $G$ be any SED. Then, (a) 
$G \not\models x = e$ such that $x \in I_G$, $e \preceq x$ and $e \not\equiv x$. 
(b) $G \not\models e_1 = e_2$ such that $Variables(e_1) \subseteq I_G$, 
$Variables(e_2) \subseteq I_G$ and $e_1 \not\equiv e_2$. 

We first show that $I_{G_2} \supseteq I_{G_1}$. Suppose for the purpose of 
contradiction that $I_{G_2} \not\supseteq I_{G_1}$. Then, $G_2 \models x = e$ for 
some variable $x \in I_{G_1}$ and expression $e$ such that $e \preceq x$ and 
$e \not\equiv x$. Since $G_1$ represents a stronger set of equivalences, 
$G_1 \models x = e$. But this is not possible because of observation (a) above. 

We now show that the inclusion is strict, that is, $I_{G_2} \supset I_{G_1}$. 
Suppose for the purpose of contradiction that $I_{G_2} = I_{G_1}$. Since $G_1$ is 
stronger than $G_2$, $G_1 \models x = e_1$ for some $x \in T - I_{G_1}$ and 
expression $e_1$ such that $Variables(e_1) \subseteq I_{G_1}$ and 
$G_2 \not\models x = e_1$. Note that $x \in T - I_{G_2}$ since 
$I_{G_2} = I_{G_1}$. Hence, there exists an expression $e_2$ such that 
$G_2 \models x = e_2$, where $Variables(e_2) \subseteq I_{G_2}$. Note that 
$e_1 \not\equiv e_2$ since $G_2 \not\models x = e_1$ and $G_2 \models x = e_2$. 
Since $G_1$ is stronger than $G_2$, $G_1 \models x = e_2$ and hence 
$G_1 \models e_1 = e_2$. But this is not possible because of observation (b) 
above. 

3.5 Correctness of the Algorithm 

The correctness of the algorithm follows from Theorem 2 and Theorem 3. 

Theorem 2 (Soundness Theorem). Let $G$ be the SED computed by the algorithm 
at some program point $P$ after fixed point computation. If 
$G \models e_1 = e_2$, then $e_1 = e_2$ holds at program point $P$. 

The proof of Theorem 2 follows directly from the soundness of the assignment 
operation (Lemma 1 in Section 3.2) and the soundness of the join operation 
(Lemma 2 in Section 3.3). 

Theorem 3 (Completeness Theorem). Let $e_1 = e_2$ be an equivalence that 
holds at a program point $P$ such that $size(e_1) \leq s$ and 
$size(e_2) \leq s$. Let $G$ be the SED computed by the algorithm at program point 
$P$ after fixed point computation. Then, $G \models e_1 = e_2$. 

The proof of Theorem 3 follows from an invariant maintained by the al- 
gorithm at each program point. For purpose of describing this invariant, we 
hypothetically extend the algorithm to maintain a set S of paths at each pro- 
gram point (representing the set of all paths analyzed by the algorithm), and a 
variable MaxSize (representing the size of the largest expression computed by 
the program along any path in S) besides an SED. These are updated as shown 
in Figure 3. The initial value of MaxSize is chosen to be 0. The initial set of 
paths is chosen to be the singleton set containing an empty path. The algorithm 
maintains the following invariant at each program point. 

[Figure 3(a), Assignment node: if MaxSize = $m$ and Paths = $S$ before the node 
$x := e$, then MaxSize = $m + size(e)$ and 
Paths = $\{p;\, x := e \mid p \in S\}$ after it.] 



Lemma 4. Let $G$ be the SED, $m$ be the value of variable MaxSize, and $S$ be 
the set of paths computed by the algorithm at some program point $P$. Suppose 
$e_1 = e_2$ holds at program point $P$ along all paths in $S$, 
$size(e_1) \leq s' - m$ and $size(e_2) \leq s' - m$. Then, 
$G \models e_1 = e_2$. 

Lemma 4 can be easily proved by induction on the number of operations 
performed by the algorithm. 

By Theorem 1 (the fixed point theorem), the algorithm executes each node at 
most $k$ times. Since the expressions in the program have total size at most 
$N$, the value of the variable MaxSize at any program point after the fixed 
point computation is at most $N \times k$. Hence, choosing 
$s' = s + N \times k$ enables the algorithm to discover equivalences among 
expressions of size $s$. The proof of Theorem 3 now follows easily from Lemma 4. 

3.6 Complexity Analysis 

Let $j$ be the number of join points in the program. Let $I$ be the maximum 
number of iterations of any loop performed by the algorithm. (It follows from 
Theorem 1 that $I$ is upper bounded by $k$; however, in practice, this may be a 
small constant.) One join operation $Join(G_1, G_2, s')$ takes time 
$O(k \times s') = O(k \times (s + N \times k))$. Hence, the total cost of all join 
operations is $O(k \times (s + N \times k) \times j \times I)$. The cost of all 
assignment operations is $O(N \times I)$. Hence, the total complexity of the 
algorithm is dominated by the cost of the join operations (assuming 
$j \geq 1$). For global value numbering, the choice of $s = N$ suffices, yielding 
a total complexity of 
$O(k^2 \times I \times N \times j) = O(k^3 \times N \times j)$ for the algorithm. 
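
The final bound can be checked by substituting $s = N$ into the cost of the join operations and then using the bound on the number of iterations from Theorem 1. The following worked step uses only quantities defined in this section:

```latex
\begin{align*}
k \times (s + N \times k) \times j \times I
  &= k \times (N + N \times k) \times j \times I && (s = N) \\
  &= N \times k \times (1 + k) \times j \times I \\
  &= O(k^2 \times I \times N \times j) \\
  &= O(k^3 \times N \times j) && (I \leq k \text{ by Theorem 1})
\end{align*}
```

The assignment cost $O(N \times I)$ is absorbed by the first of these terms whenever $j \geq 1$.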

4 Programs with Exponential Sized Value Graph 
Representation for Sets of Herbrand Equivalences 

Let $m$ be any positive integer. In this section, we show that there is an 
acyclic program $P_m$ of size $O(m^2)$ such that any value graph representation 
of the set of Herbrand equivalences that are true at the end of the program 
requires $\Omega(2^{m/2})$ size. The program $P_m$ is described in Section 4.2, 
and is shown in Figure 6. The program $P_m$ involves some non-trivial 
expressions. To describe these expressions, and to prove that the set of 
Herbrand equivalences that are true at the end of program $P_m$ requires 
$\Omega(2^{m/2})$ size, we introduce some notation in Section 4.1. 

Let $n$ be the largest integer such that $n \leq m$ and $n$ is a power of 2. 
Note that $n \geq m/2$. 




[Figure 3(b), Conditional node: MaxSize = $m$ and Paths = $S$ before the node 
and on both outgoing branches. Figure 3(c), Join node: MaxSize = $m_1$, 
Paths = $S_1$ and MaxSize = $m_2$, Paths = $S_2$ on the incoming edges; 
MaxSize = $\max(m_1, m_2)$ and Paths = $S_1 \cup S_2$ after the join.] 

Fig. 3. Flowchart nodes 

[Figure 4: value graphs of (a) $A(i, r_1, r_2)$, (b) $B(i, R)$ and (c) $C(S)$.] 

Fig. 4. The value graph representation of expressions $A(i, r_1, r_2)$, 
$B(i, R)$ and $C(S)$ 



4.1 Notation 

In this section, we describe some special expressions, sets of expressions, and 
their properties. For any integer $i \in \{1, .., n\}$ and expressions $r_1$ and 
$r_2$, let $A(i, r_1, r_2)$ denote the expression as shown in Figure 4(a). For any 
integer $i \in \{1, .., n\}$ and sets of expressions $\tilde{r}_1$ and 
$\tilde{r}_2$, let $\tilde{A}(i, \tilde{r}_1, \tilde{r}_2)$ denote the following 
set of expressions: 

$$\tilde{A}(i, \tilde{r}_1, \tilde{r}_2) = \{A(i, r_1, r_2) \mid r_1 \in \tilde{r}_1,\ r_2 \in \tilde{r}_2\}$$

For any integer $i \in \{1, .., n\}$ and an array $R[0 \ldots 2^i - 1]$ of 
expressions, let $B(i, R)$ denote the expression as shown in Figure 4(b). For any 
integer $i \in \{1, .., n\}$ and an array $\tilde{R}[0 \ldots 2^i - 1]$ of sets of 
expressions, let $\tilde{B}(i, \tilde{R})$ denote the following set of 
expressions: 

$$\tilde{B}(i, \tilde{R}) = \{B(i, R) \mid \forall j \in \{0, .., 2^i - 1\},\ R[j] \in \tilde{R}[j]\}$$

Using the definitions of $\tilde{A}(i, \tilde{r}_1, \tilde{r}_2)$ and 
$\tilde{B}(i, \tilde{R})$, we can show that 

$$\tilde{A}(i+1, \tilde{r}_1, \tilde{r}_2) \cap \tilde{B}(i, \tilde{R}) = \tilde{B}(i+1, \tilde{R}') \qquad (1)$$

where 

$$\tilde{R}'[j] = \tilde{R}[j] \cap \tilde{r}_1, \quad 0 \leq j < 2^i$$
$$\tilde{R}'[j] = \tilde{R}[j - 2^i] \cap \tilde{r}_2, \quad 2^i \leq j < 2^{i+1}$$

Equation 1 is also illustrated diagrammatically in Figure 5. The point to 
note is that if $\tilde{R}[0], .., \tilde{R}[2^i - 1]$ are all distinct sets of 
expressions, then the most succinct value graph representation of 
$\tilde{B}(i, \tilde{R})$ is as shown in Figure 5(b). If $\tilde{r}_1$ and 
$\tilde{r}_2$ are such that for all $0 \leq j_1, j_2 < 2^i$, the sets 
$\tilde{r}_1 \cap \tilde{R}[j_1]$ and $\tilde{r}_2 \cap \tilde{R}[j_2]$ are both 
non-empty and distinct, then the most succinct value graph representation of 
$\tilde{B}(i, \tilde{R}) \cap \tilde{A}(i+1, \tilde{r}_1, \tilde{r}_2)$ is as 
shown in Figure 5(c), whose representation is almost double the size of 
$\tilde{B}(i, \tilde{R})$ (even though it has fewer elements!). 

Note that $\tilde{A}(1, \tilde{r}_1, \tilde{r}_2) = \tilde{B}(1, \tilde{R})$ where 
$\tilde{R}[0] = \tilde{r}_1$ and $\tilde{R}[1] = \tilde{r}_2$. Hence, using 
Equation 1, we can prove by induction on $i$ that: 

[Figure 5: value graphs of (a) $\tilde{A}(i+1, \tilde{r}_1, \tilde{r}_2)$, 
(b) $\tilde{B}(i, \tilde{R})$ with leaves $\tilde{R}[0], .., \tilde{R}[2^i - 1]$, 
and (c) their intersection $\tilde{B}(i+1, \tilde{R}')$ with leaves 
$\tilde{R}'[0], .., \tilde{R}'[2^{i+1} - 1]$.] 

Fig. 5. Relationship between sets $\tilde{A}(i+1, \tilde{r}_1, \tilde{r}_2)$ and 
$\tilde{B}(i, \tilde{R})$. Nodes immediately below the horizontal dotted line are 
at the same depth $n - (i + 1)$ from the corresponding root nodes 

Proposition 2. For any $i \in \{1, .., n\}$, let $\tilde{r}_{i,1}$ and 
$\tilde{r}_{i,2}$ be some sets of expressions. For any integer $j$, let 
$j_n \ldots j_1$ be the binary representation of $j$. Then, 

$$\bigcap_{i=1}^{n} \tilde{A}(i, \tilde{r}_{i,1}, \tilde{r}_{i,2}) = \tilde{B}(n, \tilde{R})$$
$$\tilde{R}[j] = \bigcap_{i=1}^{n} \tilde{r}_{i, j_i + 1}, \quad 0 \leq j < 2^n$$

Our goal is to construct a program $P_m$ such that it satisfies the equivalences 
$E_i = \{z = e \mid e \in \tilde{A}(i, \tilde{r}_{i,1}, \tilde{r}_{i,2})\}$ at the 
$i$-th predecessor of some join point. Note that after the join point it will 
satisfy the equivalences $E = \{z = e \mid e \in \tilde{B}(n, \tilde{R})\}$, where 
$\tilde{B}(n, \tilde{R})$ is as defined in Proposition 2. The representation of 
$E$ would require size exponential in $n$ if the sets $\tilde{r}_{i,1}$ and 
$\tilde{r}_{i,2}$ are such that for each distinct choice of bits 
$j_1, .., j_n$, the set $\bigcap_{i=1}^{n} \tilde{r}_{i, j_i + 1}$ is distinct and 
non-empty. This can be easily accomplished if the program has $2^n$ variables (by 
choosing sets $\tilde{r}_{i,1}$ and $\tilde{r}_{i,2}$ to contain an appropriate 
subset of the $2^n$ program variables such that 
$\bigcap_{i=1}^{n} \tilde{r}_{i, j_i + 1}$ is a singleton set containing a 
distinct program variable). In the rest of this section, we show how to 
accomplish this using just $n$ program variables (by choosing sets 
$\tilde{r}_{i,1}$ and $\tilde{r}_{i,2}$ to contain small terms constructed from 
just $n$ program variables). 

For any array $S[0 \ldots 2n - 1]$ of expressions, let $C(S)$ denote the 
expression as shown in Figure 4(c). For any array $\tilde{S}[0 \ldots 2n - 1]$ of 
sets of expressions, let $\tilde{C}(\tilde{S})$ denote the following set of 
expressions: 

$$\tilde{C}(\tilde{S}) = \{C(S) \mid \forall i \in \{0, .., 2n - 1\},\ S[i] \in \tilde{S}[i]\}$$

For any integer $i \in \{1, .., n\}$, $b \in \{1, 2\}$, let 
$S_{i,b}[0 \ldots 2n - 1]$ be the following array of expressions, 

$$S_{i,b}[j] = \begin{cases} 1, & \text{if } j = 2(i - 1) + b - 1 \\ 0, & \text{otherwise} \end{cases}$$

For any integer $i \in \{1, .., n\}$, $b \in \{1, 2\}$, let 
$\tilde{S}_{i,b}[0 \ldots 2n - 1]$ be the following array of sets of expressions, 

$$\tilde{S}_{i,b}[j] = \begin{cases} \{x_i, 1\}, & \text{if } j = 2(i - 1) + b - 1 \\ \{x_1, .., x_{i-1}, x_{i+1}, .., x_n, 0\}, & \text{otherwise} \end{cases}$$

For any integer $j \in \{0, .., 2^n - 1\}$, let $j_n \ldots j_1$ be the binary 
representation of $j$. Let $T_j[0 \ldots 2n - 1]$ be the following array of 
expressions: 

$$T_j[2(\ell - 1) + j_\ell] = x_\ell, \quad 0 < \ell \leq n$$
$$T_j[2(\ell - 1) + 1 - j_\ell] = 0, \quad 0 < \ell \leq n$$

Using the definitions of $\tilde{C}(\tilde{S})$, $\tilde{S}_{i,b}$ and $T_j$, we 
can prove the following proposition. 

Proposition 3. Let $j \in \{0, .., 2^n - 1\}$. Let $j_n \ldots j_1$ be the binary 
representation of $j$. Then, 

$$\bigcap_{i=1}^{n} \tilde{C}(\tilde{S}_{i, j_i + 1}) = \{C(T_j)\}$$
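
Proposition 3 can be checked mechanically for small n. The Python sketch below does so under two assumptions that are ours: the expression C(S) is treated as a fixed shape with 2n leaf slots, so that intersecting sets of the form C applied to an array of sets reduces to intersecting the arrays slot by slot, and the "otherwise" entries of the arrays of sets contain 0 together with every variable except x_i. The function names S_tilde, T and check are illustrative.

```python
def S_tilde(n, i, b):
    """Array of sets for branch i and side b (1-based i, b in {1, 2})."""
    special = 2 * (i - 1) + b - 1
    others = {f"x{j}" for j in range(1, n + 1) if j != i} | {0}
    return [({f"x{i}", 1} if p == special else set(others)) for p in range(2 * n)]


def T(n, j):
    """Array T_j, built from the bits j_1, .., j_n of j."""
    arr = [None] * (2 * n)
    for l in range(1, n + 1):
        bit = (j >> (l - 1)) & 1          # bit j_l of j
        arr[2 * (l - 1) + bit] = f"x{l}"
        arr[2 * (l - 1) + 1 - bit] = 0
    return arr


def check(n):
    """True iff, for every j, the slot-wise intersection is the singleton T_j."""
    for j in range(2 ** n):
        arrays = [S_tilde(n, i, ((j >> (i - 1)) & 1) + 1) for i in range(1, n + 1)]
        slots = [set.intersection(*(a[p] for a in arrays)) for p in range(2 * n)]
        if slots != [{t} for t in T(n, j)]:
            return False
    return True
```

Under these assumptions check(2) and check(3) succeed. The case n = 1 is degenerate, since a single set {x_1, 1} is not a singleton, so the sketch is only meaningful for n of at least 2.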

Note that $\tilde{C}(\tilde{S}_{i,b})$ are an appropriate choice for sets 
$\tilde{r}_{i,b}$. The following proposition, which follows from Proposition 2 
and Proposition 3, summarizes the interesting property of these sets. 

Proposition 4. 

$$\bigcap_{i=1}^{n} \tilde{A}(i, \tilde{C}(\tilde{S}_{i,1}), \tilde{C}(\tilde{S}_{i,2})) = \{B(n, R)\}$$
$$R[j] = C(T_j), \quad 0 \leq j < 2^n$$


4.2 The Program Pm 

The program $P_m$, which contains an $n$-branch switch statement, is shown in 
Figure 6. It consists of $n + 1$ local variables, $z, x_1, x_2, .., x_n$. The 
expressions $a_i$ and $b$ are defined below. 

$$a_i = A(i, C(S_{i,1}), C(S_{i,2}))$$
$$b = B(n, R)$$
$$R[j] = C(T_j), \quad 0 \leq j < 2^n$$

Fig. 6. The program $P_m$. The expressions $a_i$ and $b$ are defined in Section 4.2 

Note that for all $i \in \{1, .., n\}$, $size(a_i) \leq 6n$. Thus, the size of 
program $P_m$ is $O(n^2) = O(m^2)$. We now show that any value graph 
representation of the set of equivalences that hold at the end of the program 
$P_m$ requires $\Omega(2^{m/2})$ nodes. First note that it suffices to maintain 
only equivalences of the form $x = e$ where $x$ is a variable and $e$ an 
expression. (This also follows from the fact that the SED data structure that we 
introduce in Section 3.1 can represent the set of equivalences at any program 
point.) The following theorem implies that there is only one such equivalence, 
namely $z = b$, that holds at the end of program $P_m$. 

Theorem 4. Let $E$ denote the set of all Herbrand equivalences of the form 
$x = e$ that are true at the end of the program $P_m$. Then, 

$$E = \{z = b\}$$

Proof. Let $E_i$ denote the set of all Herbrand equivalences of the form $x = e$ 
that are true at point $L_i$ in the program $P_m$. Then it is not difficult to 
see that: 

$$E_i = \{z = e \mid e \in \tilde{A}(i, \tilde{C}(\tilde{S}_{i,1}), \tilde{C}(\tilde{S}_{i,2}))\} \cup \{x_i = 1\} \cup \{x_j = 0 \mid 1 \leq j \leq n,\ j \neq i\}$$

Using Proposition 4 we get: 

$$E = \bigcap_{i=1}^{n} E_i = \{z = e \mid e \in \bigcap_{i=1}^{n} \tilde{A}(i, \tilde{C}(\tilde{S}_{i,1}), \tilde{C}(\tilde{S}_{i,2}))\} = \{z = e \mid e \in \{b\}\} = \{z = b\}$$

Note that any value graph representation of expression $b$ must have size 
$\Theta(2^n)$ since $R[j_1] \neq R[j_2]$ for $j_1 \neq j_2$. Hence, any value 
graph representation of the equivalence $z = b$ requires $\Theta(2^n)$ nodes, 
which is $\Omega(2^{m/2})$ because $n \geq m/2$. 

5 Related Work 

Kildall’s Algorithm: Kildall’s algorithm [7] performs an abstract interpretation 
over the lattice of sets of Herbrand equivalences. It represents the set of Herbrand 
equivalences at each program point by means of a structured partition. 
The join operation for two structured partitions $\pi_1$ and $\pi_2$ is defined 
to be their intersection. Kildall’s algorithm is complete in the sense that if it 
terminates, then the structured partition at any program point reflects all 
Herbrand equivalences that are true at that point. However, the complexity of 
Kildall’s algorithm is exponential. The number of elements in a partition, and 
the size of each element in a partition, can both be exponential in the number 
of join operations performed. 

Alpern, Wegman and Zadeck’s (AWZ) Algorithm: The AWZ algorithm [1] works 
on the value graph representation [8] of a program that has been converted to 
SSA form. A value graph can be represented by a collection of nodes of the 
form $\langle V, t \rangle$ where $V$ is a set of variables, and the type $t$ is 
either $\bot$, a constant $c$ (indicating that the node has no successors), 
$F(n_1, n_2)$ or $\phi_m(n_1, n_2)$ (indicating that the node has two ordered 
successors $n_1$ and $n_2$). $\phi_m$ denotes the $\phi$ function associated with 
the $m$-th join point in the program. Our data structure SED can be regarded as a 
special form of a value graph which is acyclic and has no $\phi$-type nodes. The 
main step in the AWZ algorithm is to use congruence partitioning to merge some 
nodes of the value graph. 

The AWZ algorithm cannot discover all equivalences among program terms. 
This is because it treats $\phi$ functions as uninterpreted. The $\phi$ functions 
are an abstraction of the if-then-else operator wherein the conditional in the 
if-then-else expression is abstracted away, but the two possible values of the 
if-then-else expression are retained. Hence, the $\phi$ functions satisfy the 
following two equations. 

$$\forall e : \phi_m(e, e) = e \qquad (2)$$
$$\forall e_1, e_2, e_3, e_4 : \phi_m(F(e_1, e_2), F(e_3, e_4)) = F(\phi_m(e_1, e_3), \phi_m(e_2, e_4)) \qquad (3)$$
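
To make the effect of Equations 2 and 3 concrete, the following Python sketch applies them as left-to-right simplification rules on plain term trees. It is an illustration only and is not the AWZ or RKS algorithm: it works on trees rather than value graphs, ignores the subscript $m$ by assuming a single join point, and uses our own encoding in which a term is a variable name or a tuple ("F", e1, e2) or ("phi", e1, e2).

```python
def normalize(t):
    """Push phi below F (Equation 3) and collapse phi(e, e) to e (Equation 2)."""
    if not isinstance(t, tuple):
        return t                                   # a variable is already normal
    op, a, b = t[0], normalize(t[1]), normalize(t[2])
    if op == "phi":
        if a == b:                                 # Equation 2
            return a
        if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] == "F":
            # Equation 3: phi(F(e1, e2), F(e3, e4)) = F(phi(e1, e3), phi(e2, e4))
            return ("F",
                    normalize(("phi", a[1], b[1])),
                    normalize(("phi", a[2], b[2])))
    return (op, a, b)
```

For example, the term phi(F(x, a), F(x, b)) normalizes to F(x, phi(a, b)), an equivalence that is missed when phi is treated as uninterpreted. On trees this rewriting can duplicate work that a value graph would share.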

Rüthing, Knoop and Steffen’s (RKS) Algorithm: Like the AWZ algorithm, the 
RKS algorithm [12] also works on the value graph representation of a program 
that has been converted to SSA form. It tries to capture the semantics of $\phi$ 
functions by applying the following rewrite rules, which are based on equations 
2 and 3, to convert program expressions into some normal form. 

$$\langle V, \phi_m(n, n) \rangle \text{ and } n \;\rightarrow\; \langle V \cup Vars(n), Type(n) \rangle \qquad (4)$$
$$\langle V, \phi_m(\langle V_1, F(n_1, n_2) \rangle, \langle V_2, F(n_3, n_4) \rangle) \rangle \;\rightarrow\; \langle V, F(\langle \emptyset, \phi_m(n_1, n_3) \rangle, \langle \emptyset, \phi_m(n_2, n_4) \rangle) \rangle \qquad (5)$$

Nodes on the left of the rewrite rules are replaced by the (new) node on the 
right, and incoming edges to nodes on the left are made to point to the new 
node. However, there is a precondition to applying the second rewriting rule. 

$$P : \forall \text{ nodes } n \in succ^*(\{\langle V_1, F(n_1, n_2) \rangle, \langle V_2, F(n_3, n_4) \rangle\}),\ Vars(n) \neq \emptyset$$

The RKS algorithm assumes that all assignments are of the form $x := F(y, z)$ 
to make sure that for all original nodes $n$ in the value graph, 
$Vars(n) \neq \emptyset$. This precondition is necessary in arguing termination 
for this system of rewrite rules, and proving the polynomial complexity bound. 
The RKS algorithm alternately applies the AWZ algorithm and the two rewrite 
rules until the value graph reaches a fixed point. Thus, the RKS algorithm 
discovers more equivalences than the AWZ algorithm. 

The RKS algorithm cannot discover all equivalences even in acyclic pro- 
grams. This is because the precondition P can prevent two equal expressions 
from reaching the same normal form. On the other hand, lifting precondition 
P may result in the creation of an exponential number of new nodes, and an 
exponential number of applications of the rewrite rules. Such would be the case 
when, for example, the RKS algorithm is applied to the program $P_m$ described 
in Section 4. 

The RKS algorithm has another problem, which the authors have identified. 
It fails to discover all equivalences in cyclic programs, even if the precondition 
P is lifted. This is because the graph rewrite rules add a degree of pessimism to 
the iteration process. While congruence partitioning is optimistic, it relies on the 
result of the graph transformations which are pessimistic, as they are applied 
outside of the fixed point iteration process. 

Gulwani and Necula's Randomized Algorithm: Recently, we gave a randomized 
polynomial-time algorithm that discovers all Herbrand equivalences among program 
terms [6]. This algorithm can also verify all Herbrand equivalences that 
are true at any point in a program. However, there is a small probability (over 
the random numbers chosen by the algorithm) that this algorithm 
deduces false equivalences. This algorithm is based on the idea of random 
interpretation, which involves performing abstract interpretation using randomized 
data structures and algorithms. 

6 Conclusion and Future Work 

We have given a polynomial-time algorithm for global value numbering. We have 
shown that there are programs for which the set of all equivalences contains 
terms whose value graph representation requires exponential size. This justifies 
the design of our algorithm, which discovers all equivalences among terms of size 
at most s in time that grows linearly with s. 

An interesting theoretical question is to determine whether there exist representations 
that avoid the exponential lower bound for representing the set of all Herbrand 
equivalences. 

The next step is to perform experiments to compare the different algorithms 
with regard to running time and number of equivalences discovered. Results of 
our algorithm can also be used as a benchmark to estimate the incompleteness 
of the existing algorithms. 

An interesting direction of future work is to extend this algorithm to perform 
precise inter-procedural value numbering. It would also be useful to extend the 
algorithm to reason about some properties of program operators like commutativity, 
associativity or both. 

Abstract. This paper presents a static analysis that computes quantitative information 
for recursive heap structures in programs with destructive updates. The 
algorithm targets tree structures and is able to extract quantitative information 
about the height and the balancing of such structures. We formulate the algorithm 
as a dataflow analysis. We use a heap abstraction that captures both shape 
invariants and quantities of heap structures. Then, we give a precise specification 
of the transfer functions that describe how each statement updates this abstraction. 
The algorithm is able to verify the correctness of re-balancing operations 
after AVL tree insertions. 



1 Introduction 

Dynamic data structures represent fundamental constructs in virtually all programming 
languages. To check or enforce the correctness of programs that manipulate such struc- 
tures, the compiler must automatically extract invariants that describe their shapes. For 
programs with destructive updates, the task of identifying the invariants is significantly
more complex because such programs temporarily invalidate them. Examples include 
even simple operations such as inserting or removing elements from a list. The chal- 
lenge is to identify that the invariants hold after the execution of destructive operations. 

In the past decades, researchers have developed a number of shape analysis algo- 
rithms that identify various properties of the heap, including aliasing, sharing, cyclicity, 
or reachability [18]. Such properties allow the compiler to distinguish between trees 
and arbitrary graphs, or between cyclic and acyclic lists. However, little work has been 
done to characterize quantitative information of heap structures, such as the height, the 
balancing, or the number of nodes or edges in such structures. 

This paper presents a quantitative shape analysis algorithm that is able to verify in- 
variants about the balancing of tree structures. To the best of our knowledge, none of 
the existing techniques are able to check such invariants for programs with destructive 
updates. The difficulty of quantitative shape analysis lies in the fact that it combines 
two complex analysis problems: shape analysis, where the compiler must statically reason 
about unbounded numbers of heap locations; and quantitative symbolic analysis, 
where the compiler must reason about numbers, symbolic quantities, and arithmetic. 
This paper proposes a solution to this problem using dataflow analysis. 

Our algorithm uses a heap abstraction that combines shape information and quantitative 
information, and computes both of them simultaneously. Computing shape is 
required because quantitative properties may depend on the shape of the structure. For 
example, one can talk about height or balancing only for non-cyclic structures. To provide 
a unified framework, our analysis captures shape information simply as another 
quantity: the reference count, which represents the number of incoming pointers to a 
given heap location. 

R. Giacobazzi (Ed.): SAS 2004, LNCS 3148, pp. 228-245, 2004. 

The abstraction in our analysis consists of a points-to graph (whose nodes model 
heap locations), augmented with quantitative information. The analysis captures quantities 
in three forms. First, it keeps track of quantitative attributes for nodes, such as 
the reference count n.c of a node n, its height n.h, and its skew n.s, defined as the 
height difference between its children. Second, it keeps track of quantitative predicates 
(or invariants), which describe the desired properties and are expressed in terms of the 
quantitative attributes. For instance, the balancing predicate is $B(n) = (-1 \le n.s \le 1)$ 
and the tree-ness (or unaliasing) predicate is $U(n) = (n.c \le 1)$. Finally, the analysis 
keeps track of auxiliary relations that describe key pieces of information not directly 
captured by predicates. For our algorithm, these are relations that describe the height 
difference between nodes which don't necessarily have the same parents. 

As in the case of other shape analyses for languages with destructive updates, the 
main difficulty is that the shape and the quantitative invariants (i.e., the predicates) may 
be temporarily broken during updating operations. To solve this problem, the analysis 
“zooms in” on the particular heap locations that are primarily involved in the destructive 
update, keeping track of numeric values for their attributes and keeping track of aux- 
iliary relations. This information enables the analysis to determine that the invariants 
hold again after the update operations. This is the standard approach taken by state- 
of-the-art shape analyses [16, 17, 9]. Our contribution is to show how the compiler can 
achieve this goal when reasoning about heap quantities. 

Unlike standard points-to and shape analyses, our algorithm focuses just on accu- 
rate results: when the information in the abstraction becomes somewhat imprecise, the 
analysis turns it into a top value. Continuing the analysis at that point would likely 
yield results that are too imprecise to be meaningful; in contrast, giving up and using 
a top value substantially simplifies parts of the algorithm. This situation happens, for 
instance, when the program traverses the field of a structure, but the analysis cannot 
determine that the resulting location is unaliased. This is also reflected in the fact that 
the analysis computes “must” points-to information, where each node may point to at 
most one other node. 

The list below summarizes the key properties of our algorithm: 

1. It simultaneously computes points-to information, shape and quantitative information 
in a unified framework. Shape is captured via the reference count quantity; 

2. It subsumes shape analysis. In particular, it can successfully handle the canonical 
in-place list reversal example [16, 17]; 

3. It keeps track of numeric quantities and auxiliary relations to be able to determine 
that the desired quantitative invariants hold after destructive operations; 

4. It is pessimistic: once the accuracy of the analysis information degrades up to a 
certain level, the analysis gives up and returns a top element. 

The remainder of the paper is organized as follows. Section 2 presents an example. 
Section 3 presents the shape analysis algorithm and shows how the algorithm works for 
the example. Section 4 discusses related work, and Section 5 presents future research 
directions. 

1: y = x.left; 
2: if (y.val == 1) { 
3:   x.val = 0; 
4:   y.val = 0; 
5:   z = y.right; 
6:   y.right = x; 
7:   x.left = z; 
8: } 

Fig. 1. Example: one of the eight possible cases for re-balancing AVL trees after insertion. 
Initially, the tree rooted at x has a skew of 2. The field val contains the skew value. 



2 Example 

Figure 1 presents an example program that our analysis is designed to handle. This is a 
fragment of the code that re-balances AVL trees after insertions. We use the following 
terms to define AVL trees. The skew (or skewing factor) of a tree is the difference 
between the height of its left and its right child. A tree is balanced if it has a skew 
between -1 and 1. Then, an AVL tree is a binary search tree such that, for each node in 
the tree, the subtree rooted at that node is balanced. 

An insertion into an AVL tree starts by inserting the new element as a leaf in the tree. 
As a result, many subtrees may no longer be balanced. The insertion process walks
up the tree from the leaf to the root, and re-balances internal nodes using rotations. At 
each point during this bottom-up traversal, it examines the current node. Depending on 
the skew of this node and its children, the program must examine eight possible cases; 
some of them require a single rotation, others require two rotations. 

Figure 1 shows the code for one of these eight cases. This code is 
written in a Java-like language. Each heap cell contains four fields: left and right
are the children nodes; val is an integer field that represents the skew of the current 
node; and data is the data field. For the purpose of re-balancing, the data field is 
irrelevant and doesn’t show up in this code fragment. Variable x points to the currently 
analyzed node; node y is the left child of x and z is the right child of y. The case being 
considered here is that where x is imbalanced to the left, with a skew of 2, and y is 
balanced, but skewed to the left with a skew of 1. In this case, the program updates the
skew factors for x and y (at lines 3 and 4), and then performs a rotation to the left (lines 
5, 6 and 7). 

We want to prove the following property. If, before line 1, we know that: 

1. the structure rooted at x is a tree; and 

2. the skew of x is 2; and 

3. all nodes other than x are balanced and the values in their val fields correctly 
describe their skews. 

then, after this code fragment (after line 7), the following conditions hold: 

1. the structure rooted at y is a tree; and 

2. all nodes are balanced and the val fields correctly describe their skews. 

Fig. 2. Points-to and quantitative information at three program points during the rotation. 

2.1 Proving the Property 

The program looks very simple: it contains six assignments, one if statement, and no 
loops. However, checking the desired property is a non-trivial task: it requires reasoning 
about the skews and heights of various subtrees, and performing arithmetic on these 
quantities. 
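
As a sanity check on a single concrete input (this is testing, not the static proof that the analysis aims for), the code of Fig. 1 can be executed on one sample tree that satisfies the assumptions. The Python sketch below does so; the class `Node`, the helper names, and the chosen tree are assumptions made for this illustration.

```python
class Node:
    def __init__(self, left=None, right=None, val=0):
        self.left, self.right, self.val = left, right, val

def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

def skew(n):
    # Height of the left child minus height of the right child.
    return 0 if n is None else height(n.left) - height(n.right)

def nodes(n):
    return [] if n is None else [n] + nodes(n.left) + nodes(n.right)

def consistent_and_balanced(root):
    """Every node is balanced and its val field equals its skew."""
    return all(-1 <= skew(n) <= 1 and n.val == skew(n) for n in nodes(root))

def rebalance(x):
    """Lines 1-7 of Fig. 1; returns the root of the resulting structure."""
    y = x.left
    if y.val == 1:
        x.val = 0
        y.val = 0
        z = y.right
        y.right = x
        x.left = z
        return y
    return x

# Sample input: x has skew 2, y has skew 1, all other nodes are balanced.
z_dual = Node(left=Node(), val=1)        # left child of y, height 2
z = Node()                               # right child of y, height 1
y = Node(left=z_dual, right=z, val=1)    # height 3, skew 1
y_dual = Node()                          # right child of x, height 1
x = Node(left=y, right=y_dual, val=2)    # skew 3 - 1 = 2

pre_skew = skew(x)
root = rebalance(x)
```

After the call, `root` is the node `y`, and `consistent_and_balanced(root)` holds for this particular tree. The analysis described in this paper aims to establish the same fact for all trees that satisfy the assumptions.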

Figure 2 describes the key pieces of information at selected points in the program, 
after lines 5, 6 and 7. The graphs in these figures encode points-to information: nodes 
labeled x, y, and z represent the heap locations that each of these variables point to; 
and the edges represent points-to relations between them. The nodes and edges shown
with dashed lines and circles are not being accessed by the program, but they play an 
important role in proving the property. These nodes, y° and z°, represent the siblings 
of y and z; in the rest of the paper, we refer to such nodes as the duals of y and z. 

For each of the selected program points, Figure 2 shows the quantitative information 
for nodes x and y. For each node n, we show the following: the attribute n.v, 
representing the value stored in its val field; the attribute n.s, representing its skew; 
the balancing predicate B(n), which shows whether or not n is balanced; and the skewing 
consistency predicate S(n), which indicates whether or not n.v and n.s are the 
same. If an attribute or predicate is not shown for x or y, it means that its precise value 
cannot be determined statically. Predicates B and S hold for all of the heap locations 
not represented by nodes in these graphs. 

Figure 2(a) shows the following information after the assignment at line 5: x has 
skew value 2, because of the assumption at the beginning of the program; attributes x.v 
and y.v are each 0 because of the assignments at lines 3 and 4; B(y) shows that y is 
balanced; and y has a skew of 1, because of the test condition at line 2 and the fact that 
S(y) was true at that point. However, several of the facts that we want to prove do not 
hold at this point: x is not balanced, S(x) does not hold, and neither does S(y). 

The situation at the next program point gets even worse: Figure 2(b) shows that the 
assignment y.right=x makes the structure become cyclic, with a cycle between x 
and y. We can no longer talk about skewing or balancing for these nodes. Therefore, all 
of x.s, y.s, and B(y) become undefined. 

However, Figure 2(c) shows that we can recover all of the desired B and S predicates 
for x and y after the assignment x.left=z (even though none of them held at 
the previous program point). The key piece of information that enables us to do so is 
the relation between the relative heights of nodes y°, z, and z°; this information is not 
explicitly shown in Figure 2. Let us denote by n.h the height attribute of n. We can use 
the following reasoning. At the program point for Figure 2(a), the skewing factors of y 
and x are y.s = 1 and x.s = 2, so: 



$$z^\circ.h = z.h + 1 \tag{1}$$

$$y.h = y^\circ.h + 2 \tag{2}$$

Further, because y is skewed to the left, its height is 1 plus the height of its left child: 

$$y.h = z^\circ.h + 1 \tag{3}$$



Now observe that the following two assignments (y.right=x and x.left=z) update 
x and y, but leave the structures rooted at y°, z, and z° unchanged. We therefore 
keep equation (1) and eliminate the attribute y.h from equations (2) and (3): 

$$z^\circ.h = z.h + 1 \tag{4}$$

$$y^\circ.h + 2 = z^\circ.h + 1 \tag{5}$$

These facts imply that: 

$$z.h = y^\circ.h \tag{6}$$



Because the structures rooted at y°, z, and z° remain unchanged, relations (4), (5), and 
(6) hold at all program points. Hence, we can use these equations to prove the desired 
invariants at the end of the program. Given the points-to relations from Figure 2(c), the 
skew of x is: 

$$x.s = z.h - y^\circ.h = 0 \tag{7}$$



The fact that z.h = y°.h (from relation (6)) also implies that the height of x is equal to 
z.h + 1, since both of its children have the same height. Given the points-to relations of 
y after the update, its skew is: 

$$y.s = z^\circ.h - x.h = z^\circ.h - (z.h + 1) = 0 \tag{8}$$



Finally, we can compare x.s to x.v, 1, and -1, and y.s to y.v, 1, and -1, to conclude that 
all of B(x), S(x), B(y), and S(y) hold again at the end of the true branch. 
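
The arithmetic in relations (1) through (8) can be replayed mechanically. The following Python sketch, with the hypothetical function name `skews_after_rotation`, takes the height of z as a free parameter, derives the other heights from equations (1) to (3), and returns the skews of x and y after the rotation. It is only a numeric check of the derivation above under those equations.

```python
def skews_after_rotation(z_h):
    """Derive heights from z.h via equations (1)-(3); return (x.s, y.s)."""
    z_dual_h = z_h + 1      # (1): y.s = 1, so z°.h = z.h + 1
    y_h = z_dual_h + 1      # (3): y is skewed left, so y.h = z°.h + 1
    y_dual_h = y_h - 2      # (2): x.s = 2, so y.h = y°.h + 2
    assert z_h == y_dual_h  # (6): z.h = y°.h
    x_s = z_h - y_dual_h    # (7): x now has children z and y°
    x_h = z_h + 1           # both children of x have the same height
    y_s = z_dual_h - x_h    # (8): y now has children z° and x
    return x_s, y_s
```

For every non-negative value of `z_h`, the function returns `(0, 0)`, matching equations (7) and (8).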



2.2 Shape Information 

The argument above implicitly assumed that all of the nodes in the points-to graphs 
from Figure 2 are distinct, and that traversing edges via the assignments y = x.left 
or z = y.right does not create back edges from y to x, or from z to y. All of these 
facts hold because the original structure is a tree. Hence, shape information is critical 
and reasoning about quantities requires reasoning about shapes as well. 

One easy way to incorporate shape information in our framework is to add one 
more quantitative attribute that describes shapes: the reference count. Given a node n, 
the reference count attribute n.c represents the number of incoming pointers from other 
heap locations. A structure is a tree if its root has a reference count of 0, and all of the 
nodes reachable from the root have reference counts of 1. 

Similarly to the other quantitative invariants, the tree invariant is temporarily broken 
in our example. After the assignment y.right=x the structure becomes cyclic: at that 
point, reference counts of x and y are both 1, so neither of them can be the root of the 
tree. However, the assignment x.left=z changes the count of z to 1, and that of y to 
zero, thus making the structure a tree rooted at y. Hence, shape is just a particular 
quantitative invariant and quantitative analysis generalizes shape analysis. 

Note that swapping assignments in lines 6 and 7 produces the same final result, 
but a different intermediate shape state: in that case there is no cycle, but node z will 
temporarily have a reference count of 2. That count will decrease to 1 after the last 
assignment. 
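
The reference counts mentioned above can be recomputed on a small explicit heap. The Python sketch below is an illustration only: the dictionary encoding of the heap and the helper names `refcounts` and `store` are assumptions, and the subtrees below z, z° and y° are omitted because they do not affect the counts of x, y and z.

```python
def refcounts(heap):
    """heap maps a node name to its (left, right) targets; None means null."""
    counts = {name: 0 for name in heap}
    for left, right in heap.values():
        for target in (left, right):
            if target is not None:
                counts[target] += 1
    return counts

def store(heap, node, field, target):
    """Return a copy of heap after the destructive update node.field = target."""
    left, right = heap[node]
    updated = dict(heap)
    updated[node] = (target, right) if field == "left" else (left, target)
    return updated

# Heap after line 5 of Fig. 1.
start = {"x": ("y", "y°"), "y": ("z°", "z"), "z": (None, None),
         "z°": (None, None), "y°": (None, None)}

after_6 = store(start, "y", "right", "x")        # line 6: y.right = x
after_7 = store(after_6, "x", "left", "z")       # line 7: x.left = z

swapped_7 = store(start, "x", "left", "z")       # lines 6 and 7 swapped
swapped_6 = store(swapped_7, "y", "right", "x")
```

In `after_6` both x and y have count 1 (the cyclic state), in `after_7` y has count 0 and z has count 1, and in `swapped_7` z temporarily has count 2 before the final heap coincides with `after_7`.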

2.3 Required Analysis Information 

To be able to prove the desired property for the example discussed in this section, an 
analysis must keep track of the following pieces of information: 

1. Points-to information: it must distinguish the heap locations that program variables 
reference and must identify points-to relations between those locations; 

2. Shape information: it must keep track of reference counts and it must be able to 
precisely identify when cyclic structures and heap locations with multiple incoming 
references temporarily arise in the program; 

3. Quantitative information: it must keep track of numeric values for quantities, of 
invariants between quantities, and of key auxiliary relations that can enable the 
recovery of the invariants after destructive updates. 

3 Algorithm 

In this section we describe the proposed quantitative shape analysis algorithm in detail. 
We consider programs which manipulate recursive heap structures. Each structure has a 
set of field selectors. These fields include pointer fields, which hold references to other 
structures, and integer fields, which hold integer quantities relevant to the invariants 
that the analysis wants to extract. Our algorithm targets tree structures, hence there are 
exactly two pointer fields; also, each structure holds one integer field which models 
skewing. Let $V_p$ be the set of program variables, $F_f = \{f_1, f_2\}$ the set of pointer fields 
(where $f_1$ is the left selector and $f_2$ is the right selector field), and $F_v = \{v\}$ the set of 
integer fields in our case. 

We assume a program representation consisting of a control-flow graph. Each node 
in the graph is a statement of one of the following forms: 

    x = null      x = new      x = y 

    x = y.f       x.f = y      x.v = c      if (x.v == c) 

where $x, y \in V_p$ are program variables, $f \in F_f$ is a pointer field, v is the (unique) 
integer field, and $c \in \mathbb{Z}$ ranges over integer constants. We use a dot notation for field 
accesses, for instance x.f, y.f, or x.v (node attributes use a similar notation, e.g., n.s or 
n.h, but one can distinguish between them because we use different letters for variables 
and nodes). The above assignments include dynamic allocations (x = new), nullification 
assignments (x = null), copy, load, and store assignments of heap references, as 
well as assignments of constants to integer fields of structures. Furthermore, the analysis 
uses the information in conditional tests that compare integer fields against constant 
values: if (x.v == c). Without loss of generality, we assume that the variables in each 
load and store assignment are distinct; and that each non-null assignment x = ... is 
preceded by x = null, if there is a non-null definition of x that reaches the assignment. 

3.1 Concrete Heaps and Properties 

The concrete heap consists of a special null location and a set L of heap locations, 
which form a tree structure. We inductively define the quantities height(l) and skew(l) 
for the substructure rooted at location l using the standard inductive definitions: 

    Height:  $height(l) = \max(height(l.f_1), height(l.f_2)) + 1$   if $l \neq null$ 
             $height(l) = 0$                                       if $l = null$ 

    Skew:    $skew(l) = height(l.f_1) - height(l.f_2)$             if $l \neq null$ 
             $skew(l) = 0$                                         if $l = null$ 

If the structure rooted at l contains cycles, then these equations do not have solutions 
over the integers $\mathbb{Z}$. In that case, the height and skew have an unknown value: 
height(l) = skew(l) = unk. Hence, these quantities take values over $\mathbb{Z}_u = \mathbb{Z} \cup \{unk\}$. 

If $val(l) \in \mathbb{Z}$ is the integer field of location l, and $refcount(l) \in \mathbb{N}$ is the reference 
count of l, the goal of the analysis is to determine that the following properties hold at 
the end of the program for all locations $l \in L$: 

    Balancing:         $-1 \le skew(l) \le 1$ 

    Consistent skews:  $val(l) = skew(l)$ 

    Unaliasing:        $refcount(l) \le 1$ 
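
The concrete definitions of height and skew, including the unknown value for cyclic structures, can be sketched in Python as follows. This is an illustration on concrete heaps rather than part of the analysis: the class `Loc`, the constant `UNK`, and the cycle detection by tracking the current path are assumptions made for this example, and the reference-count property is left out.

```python
UNK = "unk"  # stands for the unknown value used for cyclic structures

class Loc:
    def __init__(self, val=0):
        self.f1 = None    # left selector
        self.f2 = None    # right selector
        self.val = val    # integer field

def height(l, on_path=frozenset()):
    if l is None:
        return 0
    if id(l) in on_path:              # l reaches itself: no finite solution
        return UNK
    path = on_path | {id(l)}
    h1, h2 = height(l.f1, path), height(l.f2, path)
    return UNK if UNK in (h1, h2) else max(h1, h2) + 1

def skew(l):
    if l is None:
        return 0
    if height(l) == UNK:
        return UNK
    return height(l.f1) - height(l.f2)

def balanced(l):
    s = skew(l)
    return s != UNK and -1 <= s <= 1

def consistent(l):
    return l.val == skew(l)

a, b = Loc(val=1), Loc()
a.f1 = b
acyclic = (height(a), skew(a), balanced(a), consistent(a))
b.f2 = a                               # destructive update creates a cycle
cyclic = (height(a), skew(a), balanced(a))
```

Before the last store, `a` has height 2 and skew 1 and satisfies both predicates; after it, height and skew are both `UNK` and the balancing check fails.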



3.2 Abstraction 

We define the heap abstraction as follows. For each variable $x \in V_p$, there is a dual 
variable x°. Whenever a variable x is being assigned field $f_1$ of some other variable y, 
the analysis automatically assigns x° to field $f_2$ of y. Similarly, when x traverses field 
$f_2$, x° traverses field $f_1$. Let V° be the set of dual variables: $V^\circ = \{x^\circ \mid x \in V_p\}$, and 
V the set of all variables: $V = V_p \cup V^\circ$. 

The analysis uses four attributes: s is the skew of a node, v is the value stored in 
the integer field, c is the reference count, and h is the height of a node. These abstract 
attributes model the concrete quantities skew(l), val(l), refcount(l), and height(l) for 
locations $l \in L$. We classify the attributes into a set of main attributes $A_m = \{s, v, c\}$ and 
a set of auxiliary attributes $A_a = \{h\}$. The values of main attributes directly characterize 
the quantitative invariants; auxiliary attributes provide indirect information about 
the invariants, via the auxiliary relations. 

The abstract information is a pair I = (G, Q) consisting of an abstract heap G and 
abstract quantitative information Q. The heap is a pair G = (N, E), where: 

- $N \subseteq \mathcal{P}(V)$ is a set of abstract nodes. A node $n \subseteq V$ models the heap location that 
is pointed to exactly by the variables in n. There is a distinguished node $n_s = \emptyset$ 
representing the summary node, which models all of the heap locations not pointed 
to by any variables. We denote by $N_i$ the set of all individual nodes, i.e., $N_i = N - \{n_s\}$. 

- $E : N \to ((N \cup \{\bot, \top\}) \times (N \cup \{\bot, \top\}))$ describes points-to relations between 
nodes. If $n, p, q \in N$ and E(n) = (p, q), then p and q are the children of n. If 
p or q is $\bot$, then the corresponding child is null; if one of them is $\top$, then the 
corresponding child is not precisely known; and if p or q is a node in N, then the 
corresponding child is either null, or points to the location that p or q represents. 
Note that each field can point to at most one other node, hence this can be regarded 
as “must” information. 

The abstract quantitative information is a triple Q = (A, P, R), where: 

- $A : (N_i \times A_m) \to (\mathbb{Z} \cup \{\top\})$ provides numeric values for the main attributes of 
individual nodes. A top value $\top$ indicates that a precise numeric value is not known. 
We write n.a as an alternate notation for A(n, a). 

- P is a set of predicates (or invariants) for all nodes, including the summary node. 
For this analysis, $P : N \to \mathcal{T}^3$ (where $\mathcal{T} = \{0, 1\}$ is the two-valued logic) describes, 
for each node n, the truth values of the following three predicates: 

    $B(n) = (-1 \le n.s \le 1)$ 

    $S(n) = (n.s = n.v)$ 

    $U(n) = (n.c \le 1)$ 

These correspond to the balancing, consistent skewing, and unaliasing properties. 
A value of 1 (true) indicates that the invariant holds; a value of 0 (false) indicates 
that it is unknown whether or not the invariant holds. For the summary node, a 
value of 1 indicates that the invariant holds for all of the heap locations that the 
node models. We denote by $P_k(n)$, with k = 1..3, the projection of P(n) on the 
k-th component (i.e. $P_1$ is B, $P_2$ is S, and $P_3$ is U). 

- R is a set of auxiliary relations between the auxiliary attributes of the individual 
nodes $N_i$. For our analysis, R consists of linear equations between height attributes: 
n.h = n'.h + c. We use a canonical representation of these equations in the form 
of a table $R : ((N_i \cup \{\bot\}) \times A_a)^2 \to (\mathbb{Z} \cup \{\top\})$. Each entry R(n.h, n'.h) = c 
models a relation n.h = n'.h + c. Again, $\bot$ represents null values and $\top$ models 
unknown values. Despite the fact that the table may contain redundant information, 
we use this model to simplify the presentation of the analysis. A more compact 
representation can keep track only of linearly independent relations $R(n.h, n'.h) \neq \top$. 

Besides the pairs (G, Q), we define a top information $\top_I$, which models imprecise 
information about all components. The analysis uses $\top_I$ to give up when it determines 
that the information has already become inaccurate and it is unlikely that it will become 
accurate again. Being able to do so simplifies parts of the analysis. There is also a 
bottom information $\bot_I$, used as initial value at all program points. 

3.3 Consistency of Heap Abstraction 

The key fact that enables the analysis to recover the quantitative information after de- 
structive operations is that it maintains consistency relations between the skew of each 
node and the height difference of its children. More precisely, consider nodes $n \in N_i$, 
$p \in N_i \cup \{\bot\}$, and $q \in N_i \cup \{\bot\}$, where at least one of p and q is not null. If 
E(n) = (p, q), the abstraction is such that: 

$$n.s = c \in \mathbb{Z} \iff R(p.h, q.h) = c \in \mathbb{Z} \tag{9}$$

In that case, the analysis guarantees that: 

$$c > 0 \implies R(n.h, p.h) = 1 \tag{10}$$

$$c < 0 \implies R(n.h, q.h) = 1 \tag{11}$$

$$n.v = c \implies S(n) = 1 \tag{12}$$

$$-1 \le c \le 1 \implies B(n) = 1 \tag{13}$$



At load statements, the analysis uses the left-to-right implication in equation (9) to de- 
rive new auxiliary relations. After destructive updates via store assignments, the analy- 
sis uses the right-to-left implication in (9) to recover attribute values and predicates. 
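
To make the use of relations (9) through (13) concrete, the following Python sketch lists the facts that follow from a known skew c of a node n with children p and q. It is a hedged illustration: the function name `derive_facts` and the string keys are assumptions, the inequalities in (10) and (11) are taken as strict, and a predicate value of 0 means only that the invariant is not known to hold.

```python
def derive_facts(skew, val):
    """Facts implied by a known skew c of node n with children p and q."""
    relations = {("p.h", "q.h"): skew}    # (9), left to right: p.h = q.h + c
    if skew > 0:
        relations[("n.h", "p.h")] = 1     # (10): n.h = p.h + 1
    if skew < 0:
        relations[("n.h", "q.h")] = 1     # (11): n.h = q.h + 1
    predicates = {
        "S": int(val == skew),            # (12): n.v = c implies S(n) = 1
        "B": int(-1 <= skew <= 1),        # (13): -1 <= c <= 1 implies B(n) = 1
    }
    return relations, predicates
```

For example, `derive_facts(1, 1)` yields the relations p.h = q.h + 1 and n.h = p.h + 1 together with S = 1 and B = 1, which is the situation of node y in Figure 2(a) before its val field is reset.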

3.4 Notations 

We use the following notations. Given a set of nodes N, we denote by nodes(x) all of 
the nodes that contain x: $nodes(x) = \{n \mid n \in N \land x \in n\}$. Given I = ((N, E), 
(A, P, R)), a node $n \in N_i$ and another node $n' \in \mathcal{P}(V)$, we write I[n'/n] for the 
substitution of node n with node n' into I. If $n' \notin N$, the substitution replaces n with 
n' in N and in the domains of E, A, P, and R. If $n' \in N$, it removes node n from 
N and from the domains of E, A, P, R, and merges the information of n into that of 
n' using the merge operation defined in Section 3.6 (if $n' = n_s$, it only merges the 
information of E and P). 

Given a map m, then $m[x \mapsto v]$ extends the map with a new value for x, if 
$x \notin dom(m)$, or updates the map with a new value for x, if $x \in dom(m)$. We write 
$m[x_1 \mapsto v_1] \ldots [x_k \mapsto v_k]$ to denote a sequence of updates (or extensions). If S is a set, then 
$m[x \mapsto v]_{x \in S}$ updates (or extends) the map for all elements in S. Finally, if b is a 
condition, then $m[b \Rightarrow x \mapsto v]$ conditionally updates (or extends) the map m with a 
new value for x, if b is true. 
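
These notations map directly onto dictionary and set operations. The Python sketch below is one possible reading, with abstract nodes encoded as frozensets of variable names; the function names `nodes_of`, `update`, `update_all` and `update_if` are assumptions introduced for this illustration, and the substitution I[n'/n] is not covered because it depends on the merge operation of Section 3.6.

```python
def nodes_of(N, x):
    """nodes(x): all nodes of N that contain variable x."""
    return {n for n in N if x in n}

def update(m, x, v):
    """m[x -> v]: extend or overwrite, returning a new map."""
    out = dict(m)
    out[x] = v
    return out

def update_all(m, keys, v):
    """m[x -> v] for every x in the set S."""
    out = dict(m)
    for x in keys:
        out[x] = v
    return out

def update_if(m, cond, x, v):
    """m[b => x -> v]: update only when the condition b holds."""
    return update(m, x, v) if cond else dict(m)

N = {frozenset({"x", "y"}), frozenset({"z"}), frozenset()}
m = {"a": 1}
```

Each helper returns a fresh dictionary, which mirrors the way the transfer functions describe the output information in terms of the unchanged input information.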

3.5 Analysis of Statements 

This section presents in detail the transfer functions for statements in the program. 
Given an abstract information I before statement s, we describe the resulting information 
I' after the statement. Test conditions if (...) yield two outcomes: a new piece 
of information I' on the true branch, and unchanged information I'' = I on the false 
branch. 

If the input information is either top $I = \top_I$ or bottom $I = \bot_I$, then the resulting 
information is the same: I' = I. Otherwise, if I = ((N, E), (A, P, R)), the analysis 
computes I', which is either a tuple I' = ((N', E'), (A', P', R')), or is top $I' = \top_I$. 
The cases below show how the analysis computes I' for each statement in the program. 

Case x = null. For nullification statements, the analysis nullifies both variable 
x and its dual x°. The resulting information is $I' = I[n'_1/n_1] \ldots [n'_k/n_k]$, where 
$n_1, \ldots, n_k$ represents an ordered sequence of the nodes in the set $nodes(x) \cup nodes(x^\circ)$ 
and $n'_i = n_i - \{x, x^\circ\}$ for all i = 1..k. 

Case x = y. The resulting information is $I' = I[n'_1/n_1] \ldots [n'_k/n_k]$, where 
$n_1, \ldots, n_k$ is an ordered sequence of the set nodes(y) and $n'_i = n_i \cup \{x\}$. In particular, 
if $nodes(y) = \emptyset$, then I' = I. 

Case x = new. The resulting information I' = ((N', E'), (A', P', R')) is: 

    $N' = N \cup \{\{x\}\}$ 

    $E' = E[\{x\} \mapsto (\bot, \bot)]$ 

    $A' = A[(\{x\}, s) \mapsto 0, (\{x\}, v) \mapsto 0, (\{x\}, c) \mapsto 0]$ 

In other words, the allocation produces a node with null children (E), with zero 
attributes (A), and with true predicates (P). For R', the analysis sets all of the relations of 
{x} to $\top$, except $R(\{x\}.h, \bot.h) = 1$ and $R(\bot.h, \{x\}.h) = -1$. 
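
A possible encoding of this transfer function is sketched below in Python. It is an assumption-laden illustration rather than the paper's implementation: nodes are frozensets of variable names, `BOT` and `TOP` are stand-ins for the null and unknown markers, the relation table is keyed by pairs of nodes (the only auxiliary attribute being h), and the predicate triple (1, 1, 1) follows the prose statement that all predicates are true for the new node. As stated earlier, x is assumed to have been nullified before the allocation.

```python
BOT, TOP = "bot", "top"   # stand-ins for the null and unknown markers

def analyze_new(N, E, A, P, R, x):
    """Sketch of the transfer function for `x = new` on ((N, E), (A, P, R))."""
    nx = frozenset({x})
    N2 = set(N) | {nx}
    E2 = {**E, nx: (BOT, BOT)}                          # null children
    A2 = {**A, (nx, "s"): 0, (nx, "v"): 0, (nx, "c"): 0}
    P2 = {**P, nx: (1, 1, 1)}                           # B, S and U all hold
    R2 = dict(R)
    individual = [n for n in N if n]                    # skip the summary node
    for other in individual + [BOT]:
        R2[(nx, other)] = TOP                           # heights unrelated so far
        R2[(other, nx)] = TOP
    R2[(nx, BOT)] = 1                                   # {x}.h = null.h + 1
    R2[(BOT, nx)] = -1
    return N2, E2, A2, P2, R2
```

The only known height relations for the fresh node are those against null, since a newly allocated node with two null children has height 1.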

Case x = y.f. Assume that f is the first field: $f = f_1$ (the other case is symmetric). 
For each such assignment, the algorithm manufactures an additional load for the 
dual, on the opposite field: $x^\circ = y.f_2$. It then analyzes the sequence of the two loads. 
Their analysis is similar, so we present just the processing of x = y.f. 

Consider that $nodes(y) \neq \emptyset$ (otherwise, y is null and this assignment is guaranteed 
to produce a run-time error). The analysis examines the set nodes(y). For each node 
$n_y \in nodes(y)$, the analysis traverses the field f of $n_y$ and produces a new information 
$I'_y$. After analyzing all of the nodes that contain y, the algorithm merges all of the 
resulting pieces of information $I'_y$ to derive the new information I' after the assignment. 

We present how the algorithm computes each $I'_y$ by analyzing node $n_y \in nodes(y)$. 
The algorithm inspects the points-to relations of $n_y$ and considers the following cases: 

- If $E(n_y) = (\bot, *)$, where * is a wildcard symbol, then $I'_y = I$ (since the field is 
null). 

- If $E(n_y) = (\top, *)$, then $I'_y = \top_I$ (the analysis is aborted, because it is unknown 
where the left field of y points to). 

- If $E(n_y) = (n, *)$, where $n \in N_i$, then $I'_y = I[(n \cup \{x\})/n]$. 

- If $E(n_y) = (n_s, *)$ and $U(n_s) = 0$, then $I'_y = \top_I$ (the analysis is aborted, because 
the locations in the summary node may be aliased). 

- If $E(n_y) = (n_s, *)$ and $U(n_s) = 1$, then the analysis creates a new node $n_x = \{x\}$ 
via materialization [15,16]. Let $u \in N \cup \{\bot, \top\}$ such that $E(n_y) = (n_s, u)$. The 
resulting components of $I'_y = ((N', E'), (A', P', R'))$, except R', are: 

    $N' = N \cup \{n_x\}$ 

    $E' = E[n_y \mapsto (n_x, u),\; n_x \mapsto E(n_s)]$ 

    $A' = A[(n_x, s) \mapsto \top,\; (n_x, v) \mapsto \top,\; (n_x, c) \mapsto 1]$ 

    $P' = P[n_x \mapsto P(n_s)]$ 

Note that if $u = n_s$, then the dual of x is also being materialized by the dual load 
statement. In other words, the analysis performs a “double materialization” in that 
case. 

For R', the analysis enforces the consistency of the abstraction. It inspects the value 
of u and the skew of $n_y$ and determines the following relations: 

    if $n_y.s = c \in \mathbb{Z} \land u \in N_i \cup \{\bot\}$ then $R(n_x.h, u.h) = c$ 

    if $n_y.s = c > 0$ then $R(n_y.h, n_x.h) = 1$ 

The analysis adds all of these new relations to R'. Then, it computes the closure of 
the resulting table R': it deduces other relations for $n_x.h$ using the relations of $n_y$. 
Finally, it sets all of the remaining relations of $n_x.h$ to $\top$. 

Case x.f = y. This statement destructively updates the heap. As a result, many of
the existing attributes, quantities, and relations are no longer valid. The algorithm
analyzes such a statement in two steps: it first invalidates all of the quantitative
information for nodes that have been affected by the update; then, it tries to derive quantitative
information for the new structure, based on the available information in the auxiliary
relations.

As in the case of load statements, the sets $\mathit{nodes}(x)$ and $\mathit{nodes}(y)$ may contain zero,
one, or more nodes. If $\mathit{nodes}(x)$ is empty, that corresponds to a definite run-time error.
If $\mathit{nodes}(y)$ is empty, then $y$ is null and the assignment nullifies a field of $x$; we discuss
this situation at the end of this case. Finally, if $\mathit{nodes}(x)$ and $\mathit{nodes}(y)$ each contain
multiple nodes, then we analyze each pair of nodes $n_x \in \mathit{nodes}(x)$ and $n_y \in \mathit{nodes}(y)$,
and combine the results. We next show the analysis of the store statement for such a
pair $(n_x, n_y)$ of nodes.

Step 1. First, the algorithm invalidates the quantitative information that no longer holds
because of the destructive update, and produces a tuple $I'' = ((N, E''), (A'', P'', R''))$.
To determine what pieces of information to invalidate, the analysis must compute the
new points-to relations. Assume that, before the statement, the points-to information for
$n_x$ is $E(n_x) = (u_1, u_2)$, where $u_1, u_2 \in N \cup \{\bot, \top\}$. Then:

  $E'' = E[n_x \mapsto (n_y, u_2)]$

Then, the analysis computes a set $K \subseteq N$ of nodes that may reach $x$; the quantitative
information regarding skewing and height must be invalidated (killed) for all nodes in
$K$. The set $K$ is defined inductively as follows:

- if $E''(n) = (p, q)$ and $(x \in p \cup q) \vee (p = \top) \vee (q = \top)$, then $n \in K$;

- if $E''(n) = (p, q)$ and $(p \in K) \vee (q \in K)$, then $n \in K$.

The information after invalidating skews and heights, and after updating reference
counts, is:

  $A'' = A\,[(n, s) \mapsto \top]_{n \in K}\,[(n_y, c) \mapsto A(n_y, c) + 1]\,[(u_1 \in N_l) \Rightarrow (u_1, c) \mapsto A(u_1, c) - 1]$

  $P'' = P\,[n \mapsto (0, 0, P_3(n))]_{n \in K}$

  $R'' = R\,[(n.h, m.h) \mapsto \top]_{n \in K,\, m \in N \cup \{\bot\}}\,[(m.h, n.h) \mapsto \top]_{n \in K,\, m \in N \cup \{\bot\}}$
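The inductive definition of the kill set in Step 1 can be read as a small fixed-point computation. The following Python sketch is only an illustration under simplifying assumptions: node names are plain strings, the hypothetical map `contains` records which program variables each node represents, and `var` is the variable tested by the first rule (for the store `y.right = x` of the example in Section 3.7, we assume it is `y`).

```python
TOP = "top"  # unknown points-to target (assumed encoding)

def kill_set(edges, contains, var):
    """Nodes whose skew and height information must be killed.

    edges:    node -> (p, q), the two points-to targets after the update
    contains: node -> set of program variables the node represents
    var:      the variable tested by the first rule
    """
    killed = set()
    changed = True
    while changed:  # iterate until neither rule adds a node
        changed = False
        for n, (p, q) in edges.items():
            if n in killed:
                continue
            # Rule 1: a target is unknown, or a target node contains `var`.
            rule1 = TOP in (p, q) or any(
                var in contains.get(t, ()) for t in (p, q))
            # Rule 2: a target has already been killed.
            rule2 = p in killed or q in killed
            if rule1 or rule2:
                killed.add(n)
                changed = True
    return killed

# Points-to relations after y.right = x ("yd", "zd" stand for the duals).
edges = {
    "x": ("y", "yd"), "y": ("zd", "x"),
    "yd": ("ns", "ns"), "z": ("ns", "ns"),
    "zd": ("ns", "ns"), "ns": ("ns", "ns"),
}
contains = {"x": {"x"}, "y": {"y"}, "z": {"z"}}
K = kill_set(edges, contains, "y")
```

In this run, `x` enters the set through the first rule (it points to the node of `y`), and `y` enters through the second rule (it now points to `x`), which matches the observation in Section 3.7 that the store invalidates the information of both nodes.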

Step 2. In the second step, the analysis enforces the consistency relation from
Section 3.3. The algorithm looks for nodes $n$ such that:

(C1) the skew is unknown: $n.s = \top$;

(C2) the new points-to relation is precisely known: $E''(n) = (p, q)$, with $p \neq \top$ and
$q \neq \top$;

(C3) the height difference between its children is precisely known: $R''(p.h, q.h) = c \in \mathbb{Z}$.

In that case, the algorithm derives new quantitative information for $n$ as follows:

  $A(n, s) = c$
  $B(n) = (-1 \leq c \leq 1)$
  $S(n) = (A(n, v) = c)$

  $R(n.h, p.h) = 1$, if $c \geq 0$
  $R(n.h, q.h) = 1$, if $c \leq 0$

and computes the closure of the set of linear equations. The algorithm repeatedly
performs this process until none of the nodes meet the above criteria. Note that deriving
new information for a node may make the criteria for other nodes become true, which is
why the analysis must iterate. The result of the fixed-point process is the final
information $I'$. This completes the analysis of this step and of the statement for a pair $(n_x, n_y)$
of nodes.
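The iteration in Step 2 can be sketched as follows. This is a minimal Python illustration, not the paper's implementation: it assumes that the auxiliary relations are stored as a dictionary mapping a pair `(a, b)` to the known integer difference `a.h - b.h`, it replaces the closure of linear equations by a naive symmetric and transitive closure, and it omits the predicates B and S as well as null targets. The names `close` and `step2` are hypothetical.

```python
TOP = "top"  # unknown value (assumed encoding)

def close(rel):
    """Close known height differences under symmetry and transitivity.

    rel maps (a, b) to the integer a.h - b.h.
    """
    rel = dict(rel)
    changed = True
    while changed:
        changed = False
        for (a, b), d in list(rel.items()):
            if (b, a) not in rel:
                rel[(b, a)] = -d
                changed = True
            for (b2, c), d2 in list(rel.items()):
                if b2 == b and c != a and (a, c) not in rel:
                    rel[(a, c)] = d + d2  # a.h - c.h
                    changed = True
    return rel

def step2(edges, skew, rel):
    """Derive skews for nodes that satisfy (C1)-(C3) until none is left."""
    skew, rel = dict(skew), close(rel)
    progress = True
    while progress:
        progress = False
        for n, (p, q) in edges.items():
            if skew[n] != TOP:            # (C1) skew must be unknown
                continue
            if TOP in (p, q):             # (C2) both targets known
                continue
            c = rel.get((p, q))           # (C3) p.h - q.h known
            if c is None:
                continue
            skew[n] = c
            if c >= 0:
                rel[(n, p)] = 1           # n.h = p.h + 1
            if c <= 0:
                rel[(n, q)] = 1           # n.h = q.h + 1
            rel = close(rel)
            progress = True
    return skew, rel

# Situation after x.left = z in the example ("yd", "zd" are the duals).
edges = {"x": ("z", "yd"), "y": ("zd", "x")}
skew0 = {"x": TOP, "y": TOP}
rel0 = {("zd", "z"): 1, ("z", "yd"): 0}
skew1, rel1 = step2(edges, skew0, rel0)
```

With these assumed inputs, the sketch first derives skew 0 for `x` together with `x.h = z.h + 1`; only then does the difference between `zd` and `x` become known, so `y` satisfies (C3) and also receives skew 0. This is the reason the process has to iterate.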

We briefly discuss the case where there is no node in $N$ that contains $y$ before the
statement x.f = y. In other words, $y$ is null (so this situation is similar to handling a
statement of the form x.f = null, if such a statement were in the language). The only
changes occur in Step 1, in the places where $n_y$ shows up in the formulas of $E''$ and
$A''$. These formulas become:

  $E'' = E[n_x \mapsto (\bot, u_2)]$

  $A'' = A\,[(n, s) \mapsto \top]_{n \in K}\,[(u_1 \in N_l) \Rightarrow (u_1, c) \mapsto A(u_1, c) - 1]$

Case x.v = c. This case represents assignments that update the integer fields of
heap structures. The analysis updates the attribute $v$ and the truth value of the skewing
consistency predicate for all of the nodes that contain $x$. The resulting information is
$I' = ((N, E), (A', P', R))$, where:

  $A' = A\,[(n, v) \mapsto c]_{x \in n}$

  $P' = P\,[n \mapsto (P_1(n), (A(n, s) = c), P_3(n))]_{x \in n}$

Case if (x.v == c). This case tests the value of the integer field of an element
of a structure. The analysis incorporates this information into the abstraction on the
true branch. On the false branch, it leaves the abstraction unchanged. The resulting
information on the true branch is $I' = ((N, E), (A', P, R))$, where:

  $A' = A\,[(n, v) \mapsto c]_{x \in n}\,[P_2(n) \Rightarrow (n, s) \mapsto c]_{x \in n}$

Hence, only attributes change: the algorithm sets the value attribute of nodes of $x$ to
$c$, and, if the skewing consistency condition is true, it also sets the $s$ attributes to $c$.
Therefore, test conditions provide a way to gather information about the skewing of the
constructed structure. This case completes the presentation of the analysis of statements
in the program.



3.6 Merge Operation 

To combine two pieces of abstract information at join points in the control flow, the
analysis uses the following merge operation. For top and bottom values, it uses the
standard relations: $\top_I \sqcup I = I \sqcup \top_I = \top_I$ and $\bot_I \sqcup I = I \sqcup \bot_I = I$. Given
$I_1 = (G_1, Q_1)$ and $I_2 = (G_2, Q_2)$, the merge $I = I_1 \sqcup I_2$ is a pair $(G, Q)$ which takes
the union of the nodes in $I_1$ and $I_2$ and takes the point-wise join for components in the
tuple. More precisely, if $G_1 = (N_1, E_1)$ and $G_2 = (N_2, E_2)$, then $G = (N, E)$, where:

$$N = N_1 \cup N_2$$

$$E(n) = \begin{cases} E_1(n) \sqcup E_2(n) & \text{if } n \in N_1 \cap N_2 \\ E_1(n) & \text{if } n \in N_1 - N_2 \\ E_2(n) & \text{if } n \in N_2 - N_1 \end{cases}$$

and the join $\sqcup$ over elements in $N \cup \{\bot, \top\}$ is defined as follows: the join of $u, v \in N$
is $u$ if $u = v$, and it is $\top$ otherwise; $\bot$ is the bottom element: $\bot \sqcup u = u \sqcup \bot = u$, for
all $u \in N \cup \{\bot, \top\}$; and $\top$ is the top element: $\top \sqcup u = u \sqcup \top = \top$, for all $u$.

The merge operations for $A$ and $P$ similarly take the point-wise join for the
functions they represent. The join of two numeric values in $\mathbb{Z} \cup \{\top\}$ is $c \in \mathbb{Z}$ if both values
are equal to $c$, and is $\top$ otherwise. The join over $T$ is the boolean “and” operation. The
merge operation for $R$ takes into account null values: the merge of relations $R_1$ and $R_2$
yields auxiliary relations $R$ such that $R(n_1.h, n_2.h) = R_1(h_{11}, h_{21}) \sqcup R_2(h_{12}, h_{22})$,
where $h_{ij}$ is $n_i.h$ if $n_i \in N_j$, and $\bot$ otherwise, for $i, j \in \{1, 2\}$.
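The component joins described above are simple enough to write down directly. The Python sketch below is illustrative only and covers the join on points-to targets, the flat join on numeric values, and the merge of the points-to maps; the string encodings of bottom and top and the function names are assumptions, and the merge of $A$, $P$, and $R$ is omitted.

```python
BOT, TOP = "bot", "top"  # assumed encodings of the bottom and top elements

def join_target(u, v):
    """Join over N + {bot, top}: equal nodes stay, different nodes give top."""
    if u == BOT:
        return v
    if v == BOT:
        return u
    return u if u == v else TOP

def join_num(a, b):
    """Flat join over Z + {top}: c if both values equal c, top otherwise."""
    return a if a == b else TOP

def merge_edges(E1, E2):
    """Point-wise merge of two points-to maps node -> (left, right)."""
    merged = {}
    for n in E1.keys() | E2.keys():
        if n in E1 and n in E2:
            merged[n] = tuple(join_target(u, v) for u, v in zip(E1[n], E2[n]))
        else:
            merged[n] = E1[n] if n in E1 else E2[n]
    return merged

E1 = {"x": ("y", BOT), "y": ("ns", "ns")}
E2 = {"x": ("y", "z"), "z": (BOT, BOT)}
E = merge_edges(E1, E2)
```

Here the node of `x` keeps its left target `y`, its right target becomes `z` because bottom is the identity of the join, and the nodes that occur in only one input are copied unchanged.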

It can be shown that the merge operation for the abstract information is idempotent, 
associative, and commutative, because all of the component join operations satisfy these 
properties. Furthermore, the join operation for components over Z U {T } corresponds 
to a flat ordering; and all of the other components, such as V or T, represent finite sets. 
Therefore, we conclude that the resulting abstraction forms a lattice with finite height. 

It can also be shown that the transfer functions for statements are monotonic with
respect to the partial ordering induced by the above merge operation. For store statements,
starting the analysis with less precise information leads to larger sets $K$, so more
information is being killed from the abstraction. Hence, Step 1 is monotonic. Furthermore,
Step 2 is also monotonic, since every node that satisfies conditions (C1)-(C3) in a less
precise abstraction will also satisfy those conditions in a more precise abstraction. For
load statements, a less precise abstraction produces a less precise result, because the
information in all components except $R$ remains unchanged for all nodes except $n_x$;
for $n_x$, the generated information is more precise when the analysis starts with a more
precise abstraction. For auxiliary relations, the analysis generates, in a less precise
abstraction, either the same relations, or no new relation, due to the flat ordering of skew
values for $n_y$.

3.7 Example 

Figure 3 shows the analysis results for the example from Section 2. Each of the
following lines indicates a statement and the information after that statement. We show
auxiliary relations sparsely, as sets of linearly independent relations $R(n.h, m.h) \neq \top$.

We emphasize the key aspects of the computed information. First, field load
assignments, such as y=x.left and z=y.right, create new abstract nodes and generate
new auxiliary relations. For instance, z=y.right creates node z and its dual z°, and
generates relations z°.h = z.h + 1 and y.h = z°.h + 1. Second, when the analysis
reaches the store assignment y.right=x, it destroys all of the information that
involves the skew and the height of y and x (because x reaches y after the update). For
each of them, the analysis sets the skew to top, the invariants B and S to false, and
removes all auxiliary information about y.h and x.h (but keeps the relations between
y°.h, z.h, and z°.h). The analysis also tries to infer new information, but it fails to do so
because none of the nodes satisfy conditions (C1)-(C3). Finally, the algorithm analyzes
the store assignment x.left=z. Again, it invalidates skew and height information,
but just for x in this case. Then, it determines that x satisfies conditions (C1)-(C3), so
it computes new information for it: skew value 0 and true predicates B and S; it also
generates relation x.h = z.h + 1. The key piece of information that allows the analysis to
do so is the auxiliary relation between z.h and y°.h (which validates condition (C3)).
Then y also satisfies the conditions; the analysis derives similar information for y (skew
value of 0 and true predicates B and S) and generates relation y.h = z°.h + 1.

4 Related Work 

We discuss existing work in the area of verification of linked data structures. We present 
existing techniques based on model checking, theorem proving, abstract interpretation, 
and dataflow analysis. 

Ball et al. propose a system [3] based on model checking and theorem proving
techniques to extract invariants for arbitrary C programs, including programs that
manipulate linked structures, where invariants may contain integer quantities. The system
consists of: a tool C2bp [1] that relies on a theorem prover to build a boolean abstraction
of the program relative to a fixed set of user-supplied predicates; a model checker
Bebop [2] that analyzes the boolean program; and a tool Newton that infers additional
boolean predicates when the verification fails. For quantitative shape problems such
as tree balancing, figuring out the necessary invariants is non-trivial (if possible at all):
the user would have to manually prove the desired property, identify the invariants that
the proof requires, and then supply those to the tool. Furthermore, the theorem prover
must have knowledge about the relation between quantities (e.g., height, skew) and
points-to relations. Our analysis does all this work automatically, at the expense of
being specialized to solve this problem.

[Table: for each statement of the example, the points-to and shape information, the
attributes A, the predicates P, and the auxiliary relations.]

Fig. 3. Analysis for the example. The table shows the analysis information after each instruction.
We represent auxiliary relations in R sparsely, as a set of linearly independent relations.

Yavuz-Kahveci and Bultan [19] propose a model checking technique for the
verification of programs that concurrently manipulate shared list structures with counters.
Their model uses shape graphs, represented using BDDs and augmented with counters
for summary nodes, and numeric constraints over Presburger arithmetic. Their
abstraction is not finite; to enforce termination, their system artificially bounds the number of
iterations and uses widening operators. This approach is limited to structures with one
selector, i.e., lists, and properties about numbers of nodes. In contrast, our technique
is able to verify more complex properties for more complex structures, and it does so
using a finite abstraction that automatically guarantees termination.

A large body of research has been devoted to shape analysis for programs with 
destructive updates, using dataflow analysis or abstract interpretation to verify various 
heap invariants, including aliasing, sharing, cyclicity, or reachability [18]. The tech- 
niques presented in this paper are most related to those shape analyses that distinguish 
between heap locations based on their points-to relationships with respect to program 
variables [16,9,4], and to the similar analyses which formulate the shape abstraction 
as 3-valued logic formulas [18, 5, 14, 17]. Although the transfer functions in all of these 
analyses must be constructed by the analysis designer, recent work shows that it is also 
possible to derive transfer functions automatically [13]. The above techniques focus on 
verifying shape invariants such as sharing or cyclicity; the 3-valued logic framework
has also been used to prove other invariants, such as the “is-sorted” invariant for list
sorting procedures [10]. However, the 3-valued logic abstraction is based on first-order
predicate logic augmented with transitive closure and, since this is a weaker theory 
than arithmetic, 3-valued logic is not powerful enough to verify quantitative properties 
such as the balancing of tree structures. Compared to all of this work, our analysis ex- 
tends traditional shape analysis with quantitative heap invariants, and is able to verify 
such properties for programs where destructive updates may temporarily invalidate both 
shape and quantitative invariants. 

Other approaches exclusively use theorem proving for the verification of invariants 
of linked structures. Møller and Schwartzbach propose the Pointer Assertion Logic
Engine [11], a system that verifies non-arithmetic invariants for recursive heap structures
that can be expressed using graph types [7]. The tool requires users to annotate loops
and procedures with appropriate invariants; it uses Hoare logic to generate
verification conditions that are checked by MONA [12], a theorem prover based on automata.
In contrast, our approach does not require loop annotations and, more importantly, can
identify arithmetic invariants. Recently, Kopylov has proposed a type-theoretical
approach to the verification of data structures using dependent intersection types [8]. This
approach has been implemented in the theorem-proving system MetaPRL [6] and used 
to verify all properties of red-black trees, including the arithmetic manipulations. How- 
ever, this work targets a functional implementation of red-black trees, and does not 
apply to programs with destructive updates. 

5 Conclusions and Future Work 

We have presented a quantitative analysis algorithm that is able to verify balancing
invariants for tree structures. Our algorithm can successfully analyze programs where
destructive operations temporarily invalidate both shape and quantitative invariants; thus,
it generalizes techniques that compute shape information alone. We have used a heap
abstraction that represents shape and quantitative information in a unified framework.

Although the proposed algorithm targets properties that refer to balancing and 
height of tree structures, this is a first step toward developing more general analyses 
that reason about quantitative invariants of heap structures. We believe that our pro- 
posed analysis framework (based on quantitative attributes, predicates, and auxiliary 
information) and the proposed analysis principles (of specializing and simplifying the 
abstraction and the analysis as much as possible to solve a particular problem) can be 
applied to other quantitative shape problems. 

Hence, one direction of future research is to explore a range of other quantitative 
properties and develop analyses capable of proving such properties. A first category
includes properties about the balancing and height of other tree structures, such as red- 
black trees, B-trees, or splay trees. For instance, proving the red-black property that 
every path from each node to the leaves contains the same number of black nodes re- 
quires introducing attributes to represent the node colors and attributes to record the 
minimum and maximum number of black nodes on all paths to the leaves. A second 
category of properties refers to quantities other than balancing and height; an exam- 
ple would be to prove that binary search invariants are being maintained. Finally, other 
categories include quantitative invariants for structures other than trees and invariants
between quantities of multiple structures; for instance, proving that a linked list
contains exactly the leaves of some other tree structure.

Since the philosophy of our proposed approach is to develop specialized analy- 
ses, proving different properties will require different abstractions and different transfer 
functions. Hence, another direction of future work is to develop a general specification
language that succinctly describes quantitative analyses in our framework, and provide
tools that automatically generate analyzers from those specifications.

Finally, since the analyses can become complex and heavyweight, it also becomes 
difficult to prove their correctness. Therefore, a third direction of future research is 
to explore the use of other tools (e.g., theorem provers) to automatically prove the
correctness of the analysis, or, even better, to develop tools that automatically derive correct
analyses from the abstraction and the concrete semantics of the language.

Abstract. This paper addresses the verification of properties of imperative
programs with recursive procedure calls, heap-allocated storage, and destructive
updating of pointer-valued fields - i.e., interprocedural shape analysis. It presents
a way to harness some previously known approaches to interprocedural dataflow
analysis - which in past work have been applied only to much less rich settings -
for interprocedural shape analysis.



1 Introduction 

This paper concerns techniques for static analysis of recursive programs that manipulate 
heap-allocated storage and perform destructive updating of pointer-valued fields. The 
goal is to recover shape descriptors that provide information about the characteristics 
of the data structures that a program’s pointer variables can point to. Such information 
can be used to help programmers understand certain aspects of the program’s behavior, 
to verify properties of the program, and to optimize or parallelize the program. 

The work reported in the paper builds on past work by several of the authors on static 
analysis based on 3-valued logic [1,2] and its implementation in the TVLA system [3]. 
In this setting, two related logics come into play: an ordinary 2-valued logic, as well as a 
related 3-valued logic. A memory configuration, or store, is modeled by what logicians 
call a logical structure, which consists of a predicate (i.e., a relation of appropriate 
arity) for each predicate symbol of a vocabulary V. A store is modeled by a 2-valued 
logical structure; a set of stores is abstracted by a (finite) set of bounded-size 3-valued 
logical structures. An individual of a 3-valued structure’s universe either models a single 
memory cell or, in the case of a summary individual, a collection of memory cells. 

The constraint of working with limited-size descriptors entails a loss of information 
about the store. Certain properties of concrete individuals are lost due to abstraction, 
which groups together multiple individuals into summary individuals: a property can 
be true for some concrete individuals of the group but false for other individuals. It 
is for this reason that 3-valued logic is used; uncertainty about a property’s value is
captured by means of the third truth value, 1/2.
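The loss of information caused by grouping individuals can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: truth values are encoded as 0, 1/2, and 1 using `fractions.Fraction`, the function names are hypothetical, and the connectives follow Kleene's strong 3-valued logic (minimum for conjunction, maximum for disjunction), which is the usual choice for this style of analysis rather than something spelled out in the passage above.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the third truth value, 1/2 ("unknown")

def join(a, b):
    """Information-order join: agreeing values are kept, others become 1/2."""
    return a if a == b else HALF

def summarize(values):
    """Value of a property on a summary individual standing for `values`."""
    result = values[0]
    for v in values[1:]:
        result = join(result, v)
    return result

# Kleene connectives over {0, 1/2, 1} (assumed, see the lead-in).
def k_and(a, b):
    return min(a, b)

def k_or(a, b):
    return max(a, b)

def k_not(a):
    return 1 - a

all_true = summarize([1, 1, 1])   # property holds for every grouped cell
mixed = summarize([1, 0, 1])      # property holds for some cells only
```

A property that is true for every grouped cell stays definite, whereas a property that is true for some cells and false for others evaluates to 1/2 on the summary individual.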

One of the opportunities for scaling up this approach is to exploit the compositional
structure of programs. In interprocedural dataflow analysis, one avenue for
accomplishing this is to create a summary transformer for each procedure P, and use the summary
transformer at each call site at which P is called. Each summary transformer must
capture (an over-approximation of) the net effect of a call on P. To be able to create
summary transformers, the abstract transformers for individual transitions must have
a “composable representation”; that is, given the representations of two abstract
transformers, it must be possible to represent their composition as an object of roughly the
same size. One then carries out a fixed-point-finding procedure on a collection of
equations in which each variable in the equation set has a transformer-valued value - i.e., a
value drawn from the domain of transformers - rather than a dataflow value proper.

A number of approaches to interprocedural dataflow analysis based on summary 
transformers are known [4-9]. However, not all program-analysis problems have ab- 
stract transformers that have a composable representation. 

For some problems, it is possible to address this issue by working pointwise, tabulat- 
ing composed transformers as sets of pairs of input/output values [7, 8, 10]. However, 
for interprocedural shape analysis, this approach fails to produce useful information. 
The 3-valued-logic approach to shape analysis is a storeless one: individuals, which 
model memory cells, do not have fixed identities; they are identified only up to their 
“distinguishing characteristics”, namely, their values for a specific set of unary
predicates. Because these “distinguishing characteristics” can change during the course of
a procedure call, there is no way to identify individuals in an input abstract structure 
with their corresponding individuals in the output abstract structure. In essence, a pair 
of input/output 3-valued structures loses track of the correlations between the input and 
output values of an individual’s unary predicates. Consequently, an approach based on 
tabulating composed transformers as sets of pairs of 3-valued structures is not promis- 
ing: the representation provides only a weak characterization of a procedure’s net effect. 

All is not lost, however: instead of “abstracting and then pairing” (as discussed 
above), the solution is to “pair and then abstract”. 

Observation 1. By using 3-valued structures over a doubled vocabulary $\mathcal{V} \uplus \mathcal{V}'$, where
$\mathcal{V}' = \{p' \mid p \in \mathcal{V}\}$ and $\uplus$ denotes disjoint union, one obtains a finite abstraction
that relates the predicate values for an individual at the beginning of a transition to the
predicate values for the individual at the end of the transition.

This abstraction provides a way to create much more accurate composable represen- 
tations of transformers, and hence much more accurate summary transformers, for a 
broad class of problems. Moreover, by extending the abstract domain of 3-valued logi- 
cal structures [1] with some new operations, it is possible to perform abstract interpre- 
tation of call and return statements without losing too much precision (see §4). We have 
used these ideas to create a context-sensitive shape-analysis algorithm for recursive 
programs that manipulate heap-allocated storage and perform destructive updating. 
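The difference between “abstracting and then pairing” and “pairing and then abstracting” can be seen on a toy example. The Python sketch below is a strong simplification and not the paper's construction: each memory cell carries a single unary predicate value before and after a transition, and abstraction just collects the set of distinguishing values. The function names and the two-cell scenario are assumptions made for illustration.

```python
def abstract_then_pair(before, after):
    """Abstract the input and the output separately, then pair the results."""
    return frozenset(before.values()), frozenset(after.values())

def pair_then_abstract(before, after):
    """Abstract over the doubled vocabulary: keep (p, p') per individual."""
    return frozenset((before[cell], after[cell]) for cell in before)

# Two cells; the predicate value is 1 for the cell that x points to.
before = {"c1": 1, "c2": 0}
swapped = {"c1": 0, "c2": 1}   # a transition that moves x to the other cell
same = {"c1": 1, "c2": 0}      # a transition that leaves x unchanged

weak_swap = abstract_then_pair(before, swapped)
weak_same = abstract_then_pair(before, same)
strong_swap = pair_then_abstract(before, swapped)
strong_same = pair_then_abstract(before, same)
```

In this toy setting the separately abstracted pair cannot tell the two transitions apart, while the abstraction over the doubled vocabulary records that the cell with value 1 ends with value 0 in one case and keeps value 1 in the other.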

Context-sensitive interprocedural shape analysis was also studied in [11]. A major 
difference is that [11] augments the store to include the runtime stack as an explicit 
data structure (an idea proposed in [12, 13]); the storage abstraction used in [11] is an 
abstraction of the store augmented in this fashion. In contrast, in our work the stack is 
not materialized as an explicit data structure; our approach is based on the creation of 
summary transformers, in the style of [4-6]. 

The contributions of our work include the following:

- It provides a method to create a summary transformer for each procedure P, which
can be used at each call site at which P is called.

- Our analysis obtains more general information than that obtained in [11]: 

• In [11], the result of the analysis for the exit node $e_P$ of a procedure P is (an
approximation of) the reachable memory configurations that can arise at $e_P$.

• In this paper, the result for $e_P$ is (an approximation of) the relation between the
input memory configurations at the start node $s_P$ of P and the configurations
at $e_P$, restricted to the memory configurations that are reachable at $s_P$.

Because of the different nature of the information obtained, our analysis is able to 
verify that reversing a list twice restores the original list, whereas the method of 
[11] would only show that it yields a list with the same head and the same set of 
memory cells (in some order). 

- We have been able to apply our method successfully to a richer set of programs. 
In particular, [11] only studied how to perform interprocedural analysis for
recursive list-manipulation programs. The method described in this paper is capable of
handling certain programs that manipulate binary trees. (While list-manipulation
programs can often be implemented in tail-recursive fashion - and hence can be
converted easily into loop programs - tree-manipulation programs are much less
easily converted to non-recursive form.) 

The remainder of the paper is organized as follows: §2 describes the features of 
the language to which our analysis applies. §3 reviews the abstract domain of 3-valued 
logical structures [1]. §4 describes how abstractions of logical structures over a dou- 
bled vocabulary are used to create summary transformers and perform interprocedural 
analysis. §5 discusses experimental results. §6 discusses related work. 



2 Programs and Memory Configurations 



The analysis applies to programs written in an imperative programming language in
which (i) it is forbidden to take the address of a local variable, global variable, or
parameter; and (ii) parameters are passed by value. These two features prevent direct
aliasing among variables; thus, only heap-allocated structures can be aliased. (Both Java
and ML follow these conventions.) The running example used in the paper is the
list-reversal program of Fig. 1.



    typedef struct node {
        struct node *n;
        int data;
    } *List;

    List res;

    void main(List l) {
        res = rev(l);
    }

    List rev(List x) {
        List y, z;
        z = x->n;
        x->n = NULL;
        if (z != NULL) {
            y = rev(z);
            z->n = x;
        }
        else y = x;
        return y;
    }



Fig. 1. Recursive list-reversal program.

2.1 Program Syntax

A program is defined by a set of procedures $P_i$, $0 \leq i \leq K$. Each procedure
has a set of local variables, and has a number of formal input parameters and
output parameters. To simplify our notation, we will assume that each procedure
has only one input (resp. output) parameter and one local variable; the
generalization to multiple parameters and local variables is straightforward. We also
assume that an input parameter is not modified during the execution of the
procedure. (This assumption is made solely for convenience, and involves no loss of
generality because it is always possible to copy input parameters to additional
local variables.) Thus, a procedure $P_i = (\mathit{fpi}_i, \mathit{fpo}_i, \mathit{loc}_i, G_i)$ is defined by its
input parameter $\mathit{fpi}_i$, its output parameter $\mathit{fpo}_i$, its local variable $\mathit{loc}_i$, and $G_i$, its
intraprocedural control flow graph (CFG).

A program is represented by a directed graph $G^* = (N^*, E^*)$ called an
interprocedural CFG. $G^*$ consists of a collection of intraprocedural CFGs $G_1, G_2, \ldots, G_K$,
one of which, $G_{main}$, represents the program’s main procedure. Each CFG $G_i$ contains
exactly one start node $s_i$ and exactly one exit node $e_i$. The other nodes of a CFG
represent individual statements and branches of a procedure in the usual way,¹ except that
a procedure call is represented by two nodes, a call node and a return-site node. For
$n \in N^*$, $\mathit{proc}(n)$ denotes the (index of the) procedure that contains $n$. In addition to
the ordinary intraprocedural edges that connect the nodes of the individual flowgraphs
in $G^*$, each procedure call, represented by call-node $c$ and return-site node $r$, has two
edges: (i) a call-to-start edge from $c$ to the start node of the called procedure, and (ii) an
exit-to-return-site edge from the exit node of the called procedure to $r$. The functions
$\mathit{call}$ and $\mathit{ret}$ record matching call and return-site nodes: $\mathit{call}(r) = c$ and $\mathit{ret}(c) = r$. We
assume that a start node has no incoming edges except call-to-start edges.
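To make the definition concrete, the following Python sketch builds an interprocedural CFG for the list-reversal program. It is a hypothetical encoding, not the graph of Fig. 2: the node names (`n1`, `c2`, and so on), the class `InterCFG`, and the decision to have no direct intraprocedural edge from a call node to its return-site node are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class InterCFG:
    proc: dict = field(default_factory=dict)    # node -> containing procedure
    edges: set = field(default_factory=set)     # (source, target) pairs
    start: dict = field(default_factory=dict)   # procedure -> start node
    exit: dict = field(default_factory=dict)    # procedure -> exit node
    call: dict = field(default_factory=dict)    # return-site node -> call node
    ret: dict = field(default_factory=dict)     # call node -> return-site node

    def add_proc(self, name, inner, edges):
        """Register a procedure with its inner nodes and intraprocedural edges."""
        s, e = f"s_{name}", f"e_{name}"
        for n in (s, *inner, e):
            self.proc[n] = name
        self.edges.update(edges)
        self.start[name], self.exit[name] = s, e

    def add_call(self, c, r, callee):
        """Link call node c and return-site node r to the called procedure."""
        self.call[r], self.ret[c] = c, r
        self.edges.add((c, self.start[callee]))   # call-to-start edge
        self.edges.add((self.exit[callee], r))    # exit-to-return-site edge

g = InterCFG()
g.add_proc("main", ["c1", "r1"], {("s_main", "c1"), ("r1", "e_main")})
g.add_proc(
    "rev",
    ["n1", "n2", "n3", "c2", "r2", "n4", "n5"],
    {
        ("s_rev", "n1"),   # n1: z = x->n
        ("n1", "n2"),      # n2: x->n = NULL
        ("n2", "n3"),      # n3: if (z != NULL)
        ("n3", "c2"),      # c2: call rev(z)
        ("r2", "n4"),      # r2: y = result; n4: z->n = x
        ("n4", "e_rev"),
        ("n3", "n5"),      # n5: y = x
        ("n5", "e_rev"),
    },
)
g.add_call("c1", "r1", "rev")
g.add_call("c2", "r2", "rev")
```

The two maps `call` and `ret` are inverses of each other on matching nodes, and the only edges that enter `s_rev` come from the call nodes `c1` and `c2`, in line with the assumption stated at the end of the paragraph above.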

2.2 Representing Memory Configurations with Logical Structures 

As in the static-analysis framework defined in [1], concrete memory configurations - 
or stores - are modeled by logical structures. A logical structure is associated with a 
vocabulary of predicate symbols (with given arities): V = {eq,pi, . . . ,pn} is a finite 
set of predicate symbols, where denotes the set of predicate symbols of arity k (and 
eq G V2). A logical structure supplies a predicate for each of the vocabulary’s pred- 
icate symbols. A concrete store is modeled by a 2-valued logical structure for a fixed 

* Alternatively, nodes can represent basic blocks. 




Fig. 2. Interprocedural CFG of the list-reversal program.

Table 1. Core predicates used for representing the stores manipulated by programs that use type
List. (We write predicate names in italics and code in typewriter font.)

Predicate        Intended Meaning
eq(v_1, v_2)     Do v_1 and v_2 denote the same memory cell?
q(v)             Does pointer variable q point to memory cell v?
n(v_1, v_2)      Does the n-field of v_1 point to v_2?
dle(v_1, v_2)    Is the data-field of v_1 ≤ the data-field of v_2?

vocabulary of core predicates, C. Core predicates are part of the underlying semantics 
of the language to be analyzed; they record atomic properties of stores. For instance, 
Tab. 1 lists the predicates that would be used to represent the stores manipulated by 
programs that use type List from Fig. 1, such as the store shown in Fig. 3. 2-valued 
logical structures then represent memory configurations: the individuals are the set of 
memory cells; a nullary predicate represents a Boolean variable of the program; a unary
predicate represents either a pointer variable or a Boolean-valued field of a record; and
a binary predicate represents a pointer field of a record^.

The 2-valued structure S, shown in the left-hand side of Fig. 4, encodes the store of
Fig. 3. S’s four individuals, u_1, u_2, u_3, and u_4, represent the four list cells.

[Figure: x, y -> 5 -> 2 -> 9 -> 3 -> NULL]

Fig. 3. A possible store, consisting of a four-node linked list pointed to by x and y.

The following graphical notation is used for depicting 2-valued structures:



- An individual is represented by a circle with its name inside.

- A unary predicate p is represented by having a solid arrow from p to each individual
u for which p(u) = 1, and by the absence of a p-arrow to each individual u' for
which p(u') = 0. (If predicate p is 0 for all individuals, p is not shown.)

- A binary predicate q is represented by a solid arrow labeled q between each pair
of individuals u_i and u_j for which q(u_i, u_j) = 1, and by the absence of a q-arrow
between pairs u_i' and u_j' for which q(u_i', u_j') = 0.



Thus, in structure S, pointer variables x and y point to individual u_1, whose n-field
points to individual u_2; pointer variable z does not point to any individual.

Often we only want to use a restricted class of logical structures to encode stores.
To exclude structures that do not represent admissible stores, integrity constraints can
be imposed. For instance, the predicate x(v) of Fig. 4 captures whether pointer variable
x points to memory cell v; x would be given the attribute “unique”, which imposes the
integrity constraint that x(v) can hold for at most one individual in any structure.

^ To simplify matters, our examples do not involve modeling numeric-valued variables and
numeric-valued fields (such as data). It is possible to do this by introducing other
predicates, such as the binary predicate dle (which stands for “data less-than-or-equal-to”)
listed in Tab. 1; dle captures the relative order of two nodes’ data values. Alternatively,
numeric-valued entities can be handled by combining abstractions of logical structures with
previously known techniques for creating numeric abstractions [14].

[Figure: tables of unary predicates (x, y, z) and binary predicates (n, eq) for S and for T.
In T, individual u has (x, y, z) = (1, 1, 0) and individual u' has (x, y, z) = (0, 0, 0).]

Fig. 4. The abstraction of 2-valued structure S to 3-valued structure T when we use
{x, y, z}-abstraction.



The concrete operational semantics of a programming language is defined by specifying
a structure transformer for each kind of edge e that can appear in a control-flow
graph. Formally, the structure transformer τ_e for edge e is defined using a collection of
predicate-update formulas, c(v_1, ..., v_k) = τ_{c,e}(v_1, ..., v_k), one for each core
predicate c (e.g., see [1]). These formulas define how the core predicates of a logical
structure S that arises at the source of e are transformed by e to create a logical
structure S' at the target of e; they define the value of predicate c in S' as a function
of c’s value in S. Edge e may optionally have a precondition formula, which filters out
structures that should not follow the transition along e. (In Fig. 2, edges are labeled
with statements and conditions of the programming language, rather than with such
collections of predicate-update formulas.)

The set of all 2-valued structures over vocabulary V is denoted by S_2[V].



3 The Abstract Domain of 3-Valued Logical Structures

To create abstractions of 2-valued logical structures (and hence of the stores that they
encode), we use the related class of 3-valued logical structures over the same vocabulary.
In 3-valued logical structures, a third truth value, denoted by 1/2, is introduced
to denote uncertainty: in a 3-valued logical structure, the value p(u) of predicate p on
a tuple of individuals u is allowed to be 1/2. The set of all 3-valued structures over
vocabulary V is denoted by S_3[V]. (We drop “[V]” when V is clear from the context.)

Definition 1. The truth values 0 and 1 are definite values; 1/2 is an indefinite value.
For l_1, l_2 ∈ {0, 1/2, 1}, the information order is defined as follows: l_1 ⊑ l_2 iff
l_1 = l_2 or l_2 = 1/2. The symbol ⊔ denotes the least-upper-bound operation with respect to ⊑.
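
Definition 1 can be checked mechanically. The following is a minimal sketch, assuming truth values are encoded as 0, 1, and Fraction(1, 2); the function names info_leq and join are chosen for this example only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)          # the indefinite value 1/2
VALUES = (0, HALF, 1)


def info_leq(l1, l2):
    """Information order: l1 is below l2 iff l1 = l2 or l2 = 1/2."""
    return l1 == l2 or l2 == HALF


def join(l1, l2):
    """Least upper bound with respect to the information order."""
    return l1 if l1 == l2 else HALF


# join(a, b) is an upper bound of a and b, and lies below every other upper bound.
for a in VALUES:
    for b in VALUES:
        j = join(a, b)
        assert info_leq(a, j) and info_leq(b, j)
        assert all(info_leq(j, c) for c in VALUES
                   if info_leq(a, c) and info_leq(b, c))
```

Note that 0 and 1 are incomparable in this order, so their join is 1/2: combining conflicting definite information yields "unknown".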

The abstract stores used for program analysis are 3-valued logical structures that, by
the construction discussed below, are a priori of bounded size. In general, each 3-valued
logical structure corresponds to a (possibly infinite) set of 2-valued logical structures.
Members of these two families of structures are related by canonical abstraction.

The principle behind canonical abstraction is illustrated in Fig. 4, which shows how
2-valued structure S is abstracted to 3-valued structure T. The abstraction function
is determined by a subset A of the unary predicates. The predicates in A are called
the abstraction predicates. Given A, the act of applying the corresponding abstraction
function is called A-abstraction. The canonical abstraction illustrated in Fig. 4 is
{x, y, z}-abstraction.

Abstraction is driven by the values of the “vector” of abstraction predicates for each
individual w - i.e., for S, by the values x(w), y(w), and z(w), for w ∈ {u_1, u_2, u_3, u_4}
- and, in particular, by the equivalence classes formed from the individuals that have
the same vector of values for their abstraction predicates. In S, there are two such
equivalence classes: (i) {u_1}, for which x, y, and z are 1, 1, and 0, respectively, and
(ii) {u_2, u_3, u_4}, for which x, y, and z are all 0. (The boxes in the table of unary
predicates for S show how individuals of S are grouped into two equivalence classes.) All
of the members of each equivalence class are mapped to the same individual of the
3-valued structure. Thus, all members of {u_2, u_3, u_4} from S are mapped to the same
individual in T, called u'; similarly, all members of {u_1} from S are mapped to the
same individual in T, called u.

For each non-abstraction predicate p^S of 2-valued structure S, the corresponding
predicate p^T in 3-valued structure T is formed by a “truth-blurring quotient”. The value
for a tuple u_0 in p^T is the join (⊔) of all p^S tuples that the equivalence relation on
individuals maps to u_0. For instance,

- In S, n^S(u_1, u_1) equals 0. Therefore, the value of n^T(u, u) is 0.

- In S, n^S(u_2, u_1), n^S(u_3, u_1), and n^S(u_4, u_1) all equal 0. Therefore, the value of
n^T(u', u) is 0.

- In S, n^S(u_1, u_3) and n^S(u_1, u_4) both equal 0, whereas n^S(u_1, u_2) equals 1;
therefore, the value of n^T(u, u') is 1/2 (= 0 ⊔ 1).

- In S, n^S(u_2, u_3) and n^S(u_3, u_4) both equal 1, whereas n^S(u_2, u_2), n^S(u_2, u_4),
n^S(u_3, u_2), n^S(u_3, u_3), n^S(u_4, u_2), n^S(u_4, u_3), and n^S(u_4, u_4) all equal 0;
therefore, the value of n^T(u', u') is 1/2 (= 0 ⊔ 1).

In Fig. 4, the boxes in the tables for predicates n and eq indicate these four groupings. 
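
The quotient construction can be replayed on the four-cell list. The sketch below is illustrative only: the dictionary encoding of structures and the function name canonical_abstraction are assumptions of this example, not the TVLA implementation. Abstract individuals are named by their vector of abstraction-predicate values, so (1, 1, 0) plays the role of u and (0, 0, 0) the role of u'.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def join(values):
    """Join of a non-empty collection: equal values are kept, mixed values become 1/2."""
    vals = set(values)
    return vals.pop() if len(vals) == 1 else HALF


def canonical_abstraction(universe, unary, binary, abstraction_preds):
    # Group individuals by their vector of abstraction-predicate values.
    classes = {}
    for u in universe:
        vector = tuple(unary[p][u] for p in abstraction_preds)
        classes.setdefault(vector, []).append(u)
    # Truth-blurring quotient: join the values of all tuples mapped to one abstract tuple.
    abs_unary = {p: {a: join(unary[p][u] for u in members)
                     for a, members in classes.items()}
                 for p in unary}
    abs_binary = {q: {(a, b): join(binary[q].get((u, w), 0) for u in ma for w in mb)
                      for (a, ma), (b, mb) in product(classes.items(), repeat=2)}
                  for q in binary}
    return classes, abs_unary, abs_binary


U = ["u1", "u2", "u3", "u4"]
unary = {"x": {"u1": 1, "u2": 0, "u3": 0, "u4": 0},
         "y": {"u1": 1, "u2": 0, "u3": 0, "u4": 0},
         "z": {"u1": 0, "u2": 0, "u3": 0, "u4": 0}}
binary = {"n": {("u1", "u2"): 1, ("u2", "u3"): 1, ("u3", "u4"): 1},  # absent pairs are 0
          "eq": {(c, c): 1 for c in U}}

classes, abs_unary, abs_binary = canonical_abstraction(U, unary, binary, ["x", "y", "z"])
u, u_prime = (1, 1, 0), (0, 0, 0)
print(abs_binary["n"][(u, u_prime)], abs_binary["n"][(u_prime, u)])
```

The four n-values obtained this way match the four cases listed above, and eq on (u', u') comes out as 1/2 because u' stands for three distinct cells.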

In a 2-valued structure, the eq predicate represents the equality relation on individuals.
In general, under canonical abstraction some individuals “lose their identity”
because of uncertainty that arises in the eq predicate. For instance, eq^T(u, u) = 1
because u in T represents a single individual of S. On the other hand, u' represents three
individuals of S and the quotient operation causes eq^T(u', u') to have the value 1/2.
An individual like u' is called a summary individual.

A 3-valued logical structure T is used as an abstract descriptor of a set of 2-valued 
logical structures. In general, a summary individual models a set of individuals in each 
of the 2-valued logical structures that T represents. The graphical notation for 3-valued 
logical structures (cf. structure T of Fig. 4) is derived from the one for 2-valued struc- 
tures, with the following additions: 

- Individuals are represented by circles containing their names. (In Fig. 5, discussed
in §5, we also place non-0-valued unary predicates that do not correspond to
pointer-valued program variables inside the circles.)

- A summary individual is represented by a double circle.

- Unary and binary predicates with value 1/2 are represented by dotted arrows.

^ The names of individuals are completely arbitrary: what distinguishes u' is the value of its
vector of abstraction predicates.

Thus, in every concrete structure S that is represented by abstract structure T of Fig. 4, 
pointer variables x and y definitely point to the concrete node of S that u represents. 
The n-field of that node may point to one of the concrete nodes that u' represents; u' is 
a summary individual, i.e., it may represent more than one concrete node in S. Possibly 
there is an n-field in one or more of these concrete nodes that points to another of the 
concrete nodes that u' represents, but there cannot be an n-field in any of these concrete 
nodes that points to the concrete node that u represents. 

Note that 3-valued structure T also represents 

- the acyclic lists of length 3 or more that are pointed to by x and y. 

- the cyclic lists of length 3 or more that are pointed to by x and y, such that the 
backpointer is not to the head of the list, but to the second, third, or later element. 

- some additional memory configurations with a cyclic or acyclic list pointed to by x 
and y that also contain some garbage cells that are not reachable from x and y. 

That is, T is a finite representation of an infinite set of (possibly cyclic) concrete lists, 
each of which may also be accompanied by some unreachable cells. Later in this sec- 
tion, we discuss options for fine-tuning an abstraction. For instance, it is possible to use 
canonical abstraction to define abstractions in which the acyclic lists and the cyclic lists 
are mapped to different 3-valued structures (and in which the presence or absence of 
unreachable cells is readily apparent). 

Canonical abstraction ensures that each 3-valued structure has an a priori bounded 
size, which guarantees that a fixed-point will always be reached by an iterative static- 
analysis algorithm. Another advantage of using 2- and 3-valued logic as the basis for 
static analysis is that the language used for extracting information from the concrete 
world and the abstract world is identical: every syntactic expression - i.e., every logical
formula - can be interpreted either in the 2-valued world or the 3-valued world*.

The consistency of the 2-valued and 3-valued viewpoints is ensured by a basic the- 
orem that relates the two logics, which eliminates the need for the user to write the 
usual proofs required with abstract interpretation - i.e., to demonstrate that the abstract 
descriptors that the analyzer manipulates correctly model the actual heap-allocated data 
structures that the program manipulates. Thanks to a single meta-theorem (the Em- 
bedding Theorem [1, Theorem 4.9]), which shows that information extracted from a 
3-valued structure T by evaluating a formula φ is sound with respect to the value of
φ in each of the 2-valued structures that T represents, an abstract semantics falls out
automatically from a specification of the concrete semantics (which has to be provided 
in any case whenever abstract interpretation is employed). In particular, the formulas 
that define the concrete semantics when interpreted in 2-valued logic define a sound 
abstract semantics when interpreted in 3-valued logic. Soundness of all instantiations 
of the analysis framework is ensured by the Embedding Theorem. 
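
To make the "same formula, two interpretations" point concrete, the sketch below evaluates two formulas on the concrete list S and on its abstraction T. It assumes the usual Kleene reading of the connectives (conjunction as minimum, existential quantification as maximum over the universe); the encoding and all function names are assumptions of this example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def k_and(a, b):
    return min(a, b)                      # Kleene conjunction


def exists(universe, body):
    return max(body(v) for v in universe)  # Kleene existential quantifier


def x_next_is_non_null(universe, x, n):
    """Exists v1, v2: x(v1) and n(v1, v2)."""
    return exists(universe, lambda v1:
                  exists(universe, lambda v2: k_and(x[v1], n.get((v1, v2), 0))))


def something_points_to_x_target(universe, x, n):
    """Exists v1, v2: n(v1, v2) and x(v2)."""
    return exists(universe, lambda v1:
                  exists(universe, lambda v2: k_and(n.get((v1, v2), 0), x[v2])))


# Concrete 2-valued structure S: the four-cell list pointed to by x.
S_U = ["u1", "u2", "u3", "u4"]
S_x = {"u1": 1, "u2": 0, "u3": 0, "u4": 0}
S_n = {("u1", "u2"): 1, ("u2", "u3"): 1, ("u3", "u4"): 1}

# Abstract 3-valued structure T: u is the head cell, u' summarizes the other cells.
T_U = ["u", "u'"]
T_x = {"u": 1, "u'": 0}
T_n = {("u", "u"): 0, ("u", "u'"): HALF, ("u'", "u"): 0, ("u'", "u'"): HALF}

first_in_S = x_next_is_non_null(S_U, S_x, S_n)
first_in_T = x_next_is_non_null(T_U, T_x, T_n)
second_in_S = something_points_to_x_target(S_U, S_x, S_n)
second_in_T = something_points_to_x_target(T_U, T_x, T_n)
print(first_in_S, first_in_T, second_in_S, second_in_T)
```

The first formula is 1 in S but only 1/2 in T (sound, but imprecise); the second is 0 in both, so the abstract answer is definite and agrees with the concrete one.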

* Formulas are first-order formulas with transitive closure: a formula over the vocabulary
V = {eq, p_1, ..., p_n} is defined as follows (where p*(v_1, v_2) stands for the reflexive
transitive closure of p(v_1, v_2)):

\[ p \in V, \quad \varphi \in Formulas, \quad v \in Variables \]
\[ \varphi ::= 0 \mid 1 \mid p(v_1, \ldots, v_k) \mid (\neg\varphi_1) \mid (\varphi_1 \wedge \varphi_2) \mid (\varphi_1 \vee \varphi_2) \mid (\exists v\colon \varphi_1) \mid (\forall v\colon \varphi_1) \mid p^*(v_1, v_2) \]

Instrumentation Predicates. Unfortunately, unless some care is taken in the design of
an analysis, there is a danger that as abstract interpretation proceeds, the indefinite value
1/2 will become pervasive. This can destroy the ability to recover interesting information
from the 3-valued structures collected (although soundness is maintained). A key
role in combating indefiniteness is played by instrumentation predicates, which record
auxiliary information in a logical structure. They provide a mechanism for the user to
fine-tune an abstraction: an instrumentation predicate p of arity k, which is defined by
a logical formula ψ_p(v_1, ..., v_k) over the core predicate symbols, captures a property
that each k-tuple of nodes may or may not possess. In general, adding additional
instrumentation predicates refines the abstraction, defining a more precise analysis that
is prepared to track finer distinctions among stores. This allows more properties of the
program’s stores to be identified during analysis.

Table 2. Defining formulas of some commonly used instrumentation predicates. Typically, there
is a separate predicate symbol r[n, q] for every pointer-valued variable q.

p                 Intended Meaning                                          ψ_p
t[n](v_1, v_2)    Is v_2 reachable from v_1 along n-fields?                 n*(v_1, v_2)
r[n, q](v)        Is v reachable from pointer variable q along n-fields?    ∃v_1: q(v_1) ∧ t[n](v_1, v)
c[n](v)           Is v on a directed cycle of n-fields?                     ∃v_1: n(v, v_1) ∧ t[n](v_1, v)



The introduction of unary instrumentation predicates that are then used as abstrac- 
tion predicates provides a way to control which concrete individuals are merged to- 
gether into summary nodes, and thereby to control the amount of information lost by 
abstraction. Instrumentation predicates that involve reachability properties, which can 
be defined using transitive closure, often play a crucial role in the definitions of ab- 
stractions. For instance, in program-analysis applications, reachability properties from 
specific pointer variables have the effect of keeping disjoint sublists or subtrees summarized
separately. This is particularly important when analyzing a program in which two
pointers are advanced along disjoint sublists. Tab. 2 lists some instrumentation predi- 
cates that are important for the analysis of programs that use type List. 
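
As an illustration of the reachability and cyclicity predicates t[n], r[n, q], and c[n] on a concrete (2-valued) store, the sketch below evaluates their defining formulas directly. The set-of-pairs encoding, the function names, and the extra unreachable cell g are assumptions of this example.

```python
def reflexive_transitive_closure(universe, n):
    """n*(v1, v2) as a set of pairs, computed by naive fixed-point iteration."""
    reach = {(u, u) for u in universe} | set(n)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(reach):
            for (c, d) in n:
                if b == c and (a, d) not in reach:
                    reach.add((a, d))
                    changed = True
    return reach


def instrumentation(universe, n, pointers):
    # t[n](v1, v2) = n*(v1, v2)
    t = reflexive_transitive_closure(universe, n)
    # r[n, q](v) = exists v1: q(v1) and t[n](v1, v)
    r = {q: {v for v in universe if any((v1, v) in t for v1 in cells)}
         for q, cells in pointers.items()}
    # c[n](v) = exists v1: n(v, v1) and t[n](v1, v)
    c = {v for v in universe if any((v, v1) in n and (v1, v) in t for v1 in universe)}
    return t, r, c


U = ["u1", "u2", "u3", "u4", "g"]                  # g is an unreachable garbage cell
acyclic = {("u1", "u2"), ("u2", "u3"), ("u3", "u4")}
pointers = {"x": {"u1"}, "z": set()}               # z points to no cell

t, r, c = instrumentation(U, acyclic, pointers)
t2, r2, c2 = instrumentation(U, acyclic | {("u4", "u2")}, pointers)
print(sorted(r["x"]), sorted(c), sorted(c2))
```

In the acyclic list no cell satisfies c[n]; adding a back edge from u4 to u2 makes c[n] hold on u2, u3, and u4, and in both stores r[n, x] separates the list from the garbage cell. These are the kinds of distinctions the abstraction can keep once such predicates are used as abstraction predicates.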

From the standpoint of the concrete semantics, instrumentation predicates represent
cached information that could always be recomputed by reevaluating the instrumentation
predicate’s defining formula in the current store. From the standpoint of the abstract
semantics, however, reevaluating a formula in the current (3-valued) store can lead to
a drastic loss of precision. To gain maximum benefit from instrumentation predicates,
an abstract-interpretation algorithm must obtain their values in some other way. This
problem, the instrumentation-predicate-maintenance problem, is solved by incremental
computation; the new value that instrumentation predicate p should have after a
transition via abstract state transformer τ from state σ to σ' is computed incrementally
from the known value of p in σ. An algorithm that uses τ and p’s defining formula
ψ_p(v_1, ..., v_k) to generate an appropriate incremental predicate-maintenance formula
for p is presented in [2].

The problem of automatically identifying appropriate instrumentation predicates,
using a process of abstraction refinement, is addressed in [15]. In that paper, the input
required to specify a program analysis consists of (i) a program, (ii) a characterization
of the inputs, and (iii) a query (i.e., a formula that characterizes the intended output).
That work, along with [2], provides a framework for eliminating previously required
user inputs for which TVLA has been criticized in the past. Although the
abstraction-refinement mechanism was not available for the experiments reported on in the
present paper, we believe that it will work equally well when applied to the analysis of
programs with recursive procedure calls. In particular, we have observed that the
abstraction-refinement mechanism is capable of generating instrumentation predicates that
record in/out relationships: most of the experiments described in [15] involved 2-vocabulary
structures similar to those used in the present paper, and several of the instrumentation
predicates identified relate pairs of predicates p[in]/p[out].

Other Operations on Logical Structures. Thanks to the fact that the Embedding Theorem
applies to any pair of structures for which one can be embedded into the other,
most operations on 3-valued structures need not be constrained to manipulate 3-valued 
structures that are images of canonical abstraction. Thus, it is not necessary to per- 
form canonical abstraction after the application of each abstract structure transformer. 
To ensure that abstract interpretation terminates, it is only necessary that canonical ab- 
straction be applied as a widening operator somewhere in each loop, e.g., at the target 
of each backedge in the CFG. 

Several additional operations on logical structures help prevent an analysis from 
losing precision: 

- Focus is an operation that can be invoked to elaborate a 3-valued structure - allow- 
ing it to be replaced by a set of more precise structures (not necessarily images of 
canonical abstraction) that represent the same set of concrete stores. 

- Coerce is a clean-up operation that may “sharpen” a 3-valued logical structure by 
setting an indefinite value (1/2) to a definite value (0 or 1), or discard a structure 
entirely if the structure exhibits some fundamental inconsistency (e.g., it cannot 
represent any possible concrete store). 



4 The Use of Logical Structures for Interprocedural Analysis 

Given an abstract value A_0 that represents a set of initial stores, the goal is to compute
- for each control point of each procedure - an overapproximation to the set of values
for the local variables and the heap that can arise at that point. More precisely, the goal
is to compute the “join-over-valid-paths” value for each node n:

\[ \mathrm{JOVP}(n) = \bigsqcup_{q \,\in\, \mathrm{ValidPaths}(s_{main},\, n)} pf_q(A_0) \]

where ValidPaths(s_main, n) denotes the set of paths from s_main to n in which the
call-to-start and exit-to-return-site edges in path q form a string in which each
exit-to-return-site edge is balanced by a preceding call-to-start edge, and pf_q is the
composition, in order, of the dataflow transformers for the edges of q.
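
The balance condition on valid paths amounts to a stack discipline. Below is a minimal sketch, assuming a path is given as a list of (kind, call-site) pairs; that encoding and the function name are assumptions of this example. Pending (unreturned) calls are allowed, since n may lie inside a called procedure.

```python
def is_valid_path(path):
    """path: sequence of (kind, site) with kind in {"intra", "call", "ret"}.

    "call" is a call-to-start edge leaving call node `site`;
    "ret" is an exit-to-return-site edge arriving at the return site of `site`.
    """
    pending = []                       # stack of call sites not yet returned from
    for kind, site in path:
        if kind == "call":
            pending.append(site)
        elif kind == "ret":
            # A return must match the most recent pending call.
            if not pending or pending.pop() != site:
                return False
    return True                        # leftover pending calls are permitted


ok = [("intra", None), ("call", "c1"), ("call", "c2"), ("ret", "c2"), ("intra", None)]
mismatched = [("call", "c1"), ("ret", "c2")]
unbalanced = [("intra", None), ("ret", "c1")]
print(is_valid_path(ok), is_valid_path(mismatched), is_valid_path(unbalanced))
```

Only paths that pass this check contribute to JOVP(n); a plain join over all graph paths would also include paths that return to the wrong call site.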

Let Id|_D denote the identity transformer restricted to inputs in D. For dataflow
transformers that distribute over ⊔, the JOVP solution can be obtained by finding the least
solution to the following set of equations:

\[ \phi(s_{main}) = Id|_{A_0} \qquad \text{$A_0$ describes the set of initial stores at $s_{main}$} \qquad (1) \]

\[ \phi(s_p) = Id|_D \qquad s_p \in StartNodes,\ p \neq main,\ \text{and } D = \bigsqcup_{(c,\, s_p) \in CallToStartEdges} range(\phi(c)) \qquad (2) \]

\[ \phi(n) = \bigsqcup_{(m,\, n) \in E^*} \tau_{m,n} \circ \phi(m) \qquad \text{for } n \in N^*,\ n \notin (ReturnSites \cup StartNodes) \qquad (3) \]

\[ \phi(n) = \phi(e_q) \circ \phi(call(n)) \qquad \text{for } n \in ReturnSites, \text{ and } call(n) \text{ calls } q \qquad (4) \]

Eqns. (1)-(4) can be understood as a variant of the “functional approach” of Sharir
and Pnueli [5]; in [5], this is expressed with two fixed-point-finding phases: the first
phase propagates transformer-valued values; the second phase propagates dataflow values
proper. Eqns. (1)-(4) combine these into a single phase that propagates transformer-valued
values only. Each summary transformer φ(n) is a partial function: the domain
of φ(n) overapproximates the set of reachable states at s_proc(n) from which it is possible
to reach n; the range of φ(n) equals JOVP(n), which overapproximates the set of
reachable states at n. (A two-phase approach à la Sharir and Pnueli [5] could also be
used^.)

To simplify the presentation, in §4.1 we will assume that the language does not 
support either local variables or parameter passing. In §4.2, we extend the approach to 
handle local variables and parameters. 



4.1 A Simplified Setting: No Local Variables or Parameters 

To use Eqns. (1)-(4) for interprocedural shape analysis, we follow Observation 1 and
represent each φ(n) transformer as a set of 2-vocabulary 3-valued structures. As described
below, suitable operations on 3-valued structures provide a way to compose
such transformers.

The composition operation φ(e_q) ∘ φ(call(n)) in Eqn. (4), which represents an
interprocedural-propagation step, involves transformers represented by two sets of
2-vocabulary 3-valued structures. Intuitively, this involves collecting up a set of structures,
where each structure is the “natural join” of two structures - one from each argument
set. Below, we define the operation T_2 ∘ T_1 for a single pair, T_2 and T_1.

In fact, to do this really requires three vocabularies: for each original predicate p,
we use three predicates p[in], p[out], and p[tmp]. A 2-vocabulary 3-valued structure uses
only p[in] and p[out] - or rather, the values of the p[tmp] predicates are “irrelevant”.
(When a predicate p is irrelevant, then p(u) evaluates to 1/2 for every tuple of individuals
u.) Another obstacle is to reconcile the values of the predicates in the different
2-vocabulary 3-valued structures. The solution has several parts:

^ In the two-phase approach, the first phase is defined by Eqns. (1)-(4), except that the right-hand
sides of Eqns. (1) and (2) are both replaced by Id. This permits summary transformers to be
computed in a more modular fashion - i.e., bottom-up over the call graph’s strongly-connected
components. However, it also causes the analysis to consider more input possibilities for each
procedure, which is an important consideration in our context. Eqns. (1)-(4) (as written) lead
to a less modular analysis that requires a fixed-point iteration over the entire program.

- We need an operation to move predicates in one vocabulary to predicates in another
vocabulary. The notation T[tmp ← out; out ← 1/2] denotes the (simultaneous)
transformation on structure T in which the p[out] predicates are moved to p[tmp],
and the p[out] predicates are all set to 1/2. For instance, to perform the composition
T_2 ∘ T_1, we use T_1[tmp ← out; out ← 1/2] and T_2[tmp ← in; in ← 1/2].

- We need structures that have the same sets of individuals. Because the individuals
in 3-valued structures are identified by the values they have for the (unary)
abstraction predicates, we use the operation canonical: S_3 → ℘(S_3), which refines a
3-valued structure T into a set of structures - each member of which is in the image
of canonical abstraction - such that the set describes the same set of concrete
structures as T [16].

- We define the meet of two 3-valued structures that have the same set of individuals.
Let S_1 = (U, ι_1) and S_2 = (U, ι_2) be two logical structures with the same universe
U and vocabulary V. The interpretations ι_1, ι_2 map each relation symbol p ∈ V_k
to a k-ary truth-valued function: ι_i(p) : U^k → {0, 1/2, 1}. For convenience, we
implicitly add a bottom element ⊥ to the lattice ({0, 1, 1/2}, ⊑) of Def. 1. The
meet operator S_1 ⊓ S_2 is defined as

\[ S_1 \sqcap S_2 \;\stackrel{\text{def}}{=}\; \begin{cases} (U,\ \iota_1 \sqcap \iota_2) & \text{if } \iota_1 \sqcap \iota_2 \neq \bot \\ \bot & \text{otherwise} \end{cases} \]

where

\[ \iota_1 \sqcap \iota_2 = \begin{cases} \bot & \text{if } \iota_1(p)(u) \sqcap \iota_2(p)(u) = \bot \text{ for some } p \in V_k \text{ and } u \in U^k \\ \lambda p \in V_k .\, \lambda u \in U^k .\ \iota_1(p)(u) \sqcap \iota_2(p)(u) & \text{otherwise} \end{cases} \]

If a predicate is irrelevant in S_1, its value in S_1 ⊓ S_2 is its value in S_2.

- We extend the previous definition to any pair of 3-valued structures by

\[ S_1 \sqcap S_2 = \{\, S_1' \sqcap S_2' \mid S_1' \in canonical(S_1) \wedge S_2' \in canonical(S_2) \wedge U^{S_1'} = U^{S_2'} \,\} - \{\bot\} \qquad (5) \]

With this notation, the composition of transformers T_2 ∘ T_1, where T_1 and T_2 are
2-vocabulary 3-valued structures (which are really 3-vocabulary 3-valued structures), is
expressed as follows:

\[ T_2 \circ T_1 = \bigl( T_1[tmp \leftarrow out;\ out \leftarrow 1/2] \;\sqcap\; T_2[tmp \leftarrow in;\ in \leftarrow 1/2] \bigr)[tmp \leftarrow 1/2] \qquad (6) \]

The effect is to perform a natural join on the p[tmp] predicates to create structures that
have T_1’s p[in] predicates, T_2’s p[out] predicates, and common p[tmp] predicates. The
p[tmp] predicates are then eliminated by setting them to 1/2.

The composition operation is extended to sets of structures in the usual way:

\[ SS_2 \circ SS_1 = \bigcup \{\, S_2 \circ S_1 \mid S_2 \in SS_2 \wedge S_1 \in SS_1 \,\}. \]
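
The meet-based composition can be sketched for the simple case in which both structures already share the same universe, so that the canonical step of Eqn. (5) is not needed. Everything about the encoding below is an assumption of this example: structures are dictionaries keyed by (predicate, vocabulary), None stands for the bottom element, and the helper names (meet_value, rename, compose, transformer) are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
BOTTOM = None                 # stands for the bottom element of the lattice


def meet_value(a, b):
    """Meet in the information order: 1/2 gives way, conflicting definite values clash."""
    if a == HALF:
        return b
    if b == HALF:
        return a
    return a if a == b else BOTTOM


def meet(s1, s2):
    """Pointwise meet of two structures over the same universe and vocabulary."""
    result = {}
    for key, table in s1.items():
        result[key] = {}
        for tup, a in table.items():
            v = meet_value(a, s2[key][tup])
            if v is BOTTOM:
                return BOTTOM
            result[key][tup] = v
    return result


def rename(s, moves):
    """Simultaneous update T[target <- source; ...]; a source of HALF makes it irrelevant."""
    out = {}
    for (p, voc), table in s.items():
        src = moves.get(voc, voc)
        if src is HALF:
            out[(p, voc)] = {t: HALF for t in table}
        else:
            out[(p, voc)] = dict(s[(p, src)])   # always read from the original s
    return out


def compose(t2, t1):
    """T2 o T1 in the style of Eqn. (6), for structures over one shared universe."""
    a = rename(t1, {"tmp": "out", "out": HALF})
    b = rename(t2, {"tmp": "in", "in": HALF})
    m = meet(a, b)
    return BOTTOM if m is BOTTOM else rename(m, {"tmp": HALF})


def transformer(universe, x_in, x_out):
    """Relates a store where x points to x_in with one where x points to x_out."""
    return {("x", "in"): {(u,): int(u == x_in) for u in universe},
            ("x", "out"): {(u,): int(u == x_out) for u in universe},
            ("x", "tmp"): {(u,): HALF for u in universe}}


U = ["a", "b"]
T1 = transformer(U, "a", "b")      # x moves from a to b
T2 = transformer(U, "b", "a")      # x moves from b back to a
joined = compose(T2, T1)           # outputs of T1 match inputs of T2
clash = compose(T1, T1)            # T1's output (x at b) conflicts with T1's input (x at a)
print(joined[("x", "in")], joined[("x", "out")], clash)
```

The p[tmp] vocabulary acts as the join column: T_1's outputs and T_2's inputs are both moved there, the meet keeps only compatible pairs, and the final renaming forgets the intermediate state.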

^ A different view of this step is that making the p[tmp] predicates irrelevant corresponds to
existentially quantifying them out. If expressed by means of a formula, the operation of making
p[tmp] irrelevant would involve second-order quantification over p[tmp]; however, the operation
is performed directly on a logical structure, and hence it is not a problem for us that the
operation cannot be expressed by means of a first-order formula.

In contrast, the composition operation τ_{m,n} ∘ φ(m) in Eqn. (3), which represents an
intraprocedural-propagation step, is heterogeneous: τ_{m,n} is defined using a collection
of predicate-update formulas, c(v_1, ..., v_k) = τ_{c,(m,n)}(v_1, ..., v_k), whereas φ(m) is a
set of 2-vocabulary 3-valued structures. Thus, the composition operation in Eqn. (3) can
be implemented merely by performing the standard TVLA intraprocedural-propagation
step for τ_{m,n} on the out predicates (only) for each of the structures in φ(m). (Note that
the ⊔ operation in Eqn. (3) is union of sets of 3-valued structures. Each τ_{m,n} operates
elementwise on a set of 3-valued structures, and hence distributes over ⊔.)

In practice, Eqns. (1)-(4) are solved by propagating changes in values, rather than
full values. Such a differential algorithm is presented in [17].

4.2 Local Variables and Parameters 

Until now, we have assumed that a state of a program is defined by a memory con- 
figuration, and that relations between states are represented using structures over dou- 
bled vocabularies. Things are actually a bit more complicated: a state also includes the 
values of local variables, formal input parameters, and formal output parameters. The 
summary transformer φ(n) must thus also relate the values of formal input parameters
at node s_proc(n) to the state of the heap and the values of local variables at n.

To incorporate local variables and parameters, we merely have to expand the vocabulary
to V_loc ⊎ V_g[in] ⊎ V_g[out] ⊎ V_g[tmp], where the vocabulary V_loc captures Boolean-valued
and pointer-valued local variables and parameters, and V_g is the tripled vocabulary
from §4.1. The assumption that formal input parameters are not modified in the
body of a procedure makes it unnecessary to duplicate/triplicate the predicate symbols
for parameters in V_loc. Eqn. (2) then becomes:

\[ \phi(s_p) = Id|_D, \qquad s_p \in StartNodes,\ p \neq main, \text{ and} \]

\[ D = \bigsqcup_{\substack{(c,\, s_p) \in CallToStartEdges \\ \text{and the call is } y := p(x)}} range\bigl(\tau_{fpi_p := x} \circ \phi(c)\bigr)[loc \setminus \{fpi_p\} \leftarrow 1/2] \qquad (7) \]

where τ_{· := ·} denotes the transformer generated by update formulas that correspond to
the assignment in the subscript. Eqn. (7) reflects the binding of the actual parameter x
at node c to the formal input parameter fpi_p at node s_p. All relations corresponding to
the other local variables and parameters are set to irrelevant at this node.

For a call statement of the form y := q(x), where T_2 = φ(e_q) and T_1 = φ(call(n)),
the transformer-composition operation T_2 ∘ T_1 used in Eqn. (4) to implement the abstract
procedure-return operation can be expressed as

\[
T_2 \circ T_1 = \Bigl( \tau_{y := fpo} \circ \bigl(
  (\tau_{fpi := x} \circ T_1)[tmp \leftarrow out;\ out \leftarrow 1/2]
  \;\sqcap\;
  (\tau_{fpi := fpi_q;\ fpo := fpo_q} \circ T_2)[tmp \leftarrow in;\ in \leftarrow 1/2;\ loc \leftarrow 1/2]
\bigr) \Bigr)[tmp \leftarrow 1/2;\ \{fpi, fpo\} \leftarrow 1/2]
\qquad (8)
\]

where fpi and fpo are fresh unary core predicates (not in V_loc or V_g) that are used to
impose parameter-passing constraints as follows: fpi is bound to the value of the actual
input parameter x of T_1; fpi is also bound to the value of formal input parameter fpi_q of
T_2; and fpo is bound to the value of formal output parameter fpo_q of T_2. In particular,
the fpi relation and all of the tmp relations are common in the meet operation performed
in Eqn. (8). Then, because the local variables in T_2 are set to be irrelevant, the values
for the local variables in the structures of the answer set are the values from T_1, with the
exception of the actual output parameter y, which is assigned the value of fpo = fpo_q.

4.3 An Efficient Implementation of the Meet Operation 

There are two sources of combinatorial explosion in Eqns. (4), (5), (6), and (8):

1. The number of pairs (S_1, S_2) ∈ φ(e_q) × φ(call(n)) (quadratic explosion);

2. The cardinality of the sets canonical(S_1) and canonical(S_2) in Eqn. (5) defining
the meet operator ⊓ (exponential explosion).

Point 1 is inherited from the nature of our abstract lattice, which is a powerset domain, 
and the fact that we apply a binary operation (composition) to values in the domain. We 
do not address this problem here. 

Point 2 is specific to our abstract lattice and concerns only the meet operation, especially
when it is used to implement relational composition. Consider a pair of 3-valued
structures T_1 and T_2 for which a composition is performed in Eqn. (4). In T_1, the core
predicates that represent variables of the called procedure q are irrelevant, so they have
the value 1/2. This means that the operation canonical will enumerate all possible
definite interpretations for these predicates; the number of these interpretations is
exponential in the number of such predicates. A similar situation holds for T_2.

More generally, consider a structure S = (U, ι) with n irrelevant unary core
predicates; the cost of canonical is O((2^{|U|})^n). Even if the unary predicates represent
only pointer-valued variables, which means that such predicates may evaluate to 1 on
at most one individual, there are still O(|U|^n) possible interpretations.
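
The size of this enumeration is easy to reproduce. The sketch below lists the definite interpretations of n irrelevant unary predicates by brute force; the function name and the representation of an interpretation as the set of individuals on which the predicate is 1 are assumptions of this example.

```python
from itertools import product


def definite_interpretations(universe, n_irrelevant, unique=False):
    """Yield one tuple per joint definite interpretation of the irrelevant predicates."""
    universe = list(universe)
    if unique:
        # Pointer variable: holds on at most one individual, so |U| + 1 choices each.
        choices = [frozenset()] + [frozenset([u]) for u in universe]
    else:
        # Arbitrary unary predicate: any subset of U, so 2^|U| choices each.
        choices = [frozenset(u for u, bit in zip(universe, bits) if bit)
                   for bits in product((0, 1), repeat=len(universe))]
    return product(choices, repeat=n_irrelevant)


U = ["a", "b", "c"]
unconstrained = sum(1 for _ in definite_interpretations(U, 2))
pointer_only = sum(1 for _ in definite_interpretations(U, 2, unique=True))
print(unconstrained, pointer_only)
```

With |U| = 3 and n = 2 the two counts are (2^3)^2 = 64 and (3 + 1)^2 = 16; both grow exponentially in n, which is the cost the text attributes to canonical.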

In our case, this combinatorial explosion is all the more frustrating because it is
only temporary: the meet S_1 ⊓ S_2 will reject (by evaluating to ⊥) most of the structures
obtained by enumerating definite interpretations of irrelevant predicates in S_1 (resp. S_2).
Indeed, predicates that are irrelevant in one structure and relevant in the other usually
have definite interpretations in the latter.



A Better Implementation of ⊓. The approach that we actually followed in our extended
version of TVLA was to implement an approximation to the meet operation
using systems of 3-valued constraints [1], which were already supported by the base
TVLA system. In TVLA, there is a global set of constraints C_0 that is used to express
integrity constraints on the set of 2-valued structures that a 3-valued structure represents.
For instance, some of the constraints in C_0 express the fact that a unary predicate
that represents a pointer-valued variable can evaluate to 1 on at most one individual.
For convenience, we will associate a constraint set with each structure, so that a
3-valued structure S is now a triple: (U^S, ι^S, C^S). (C^S is generally C_0.)

A set of constraints $C$ represents the set of concrete structures that satisfy $C$:

    $\gamma_c(C) = \{ S \in \mathcal{S}_2 \mid S \models C \}$    (9)

in the same way that a 3-valued structure $S$ represents the set of concrete structures
$\gamma(\{S\})$ that can be embedded into $S$ via canonical abstraction [1].

Assume now that we have an operation $cons : \mathcal{S}_3 \to \wp(\mathcal{C})$ that associates to a given
structure a set of constraints such that for any $S$, $\gamma(\{S\}) \subseteq \gamma_c(cons(S))$. In other
words, the constraint set $cons(S)$ overapproximates $S$. For any logical structures
$S_1 = \langle U^{S_1}, \iota^{S_1}, C^{S_1} \rangle$ and $S_2 = \langle U^{S_2}, \iota^{S_2}, C^{S_2} \rangle$, we now define the operation

    $S_1 \mathbin{\tilde{\sqcap}} S_2 = \langle U^{S_1}, \iota^{S_1}, C^{S_1} \cup cons(S_2) \rangle$.

This operator has the following property:
$\gamma(\{S_1 \mathbin{\tilde{\sqcap}} S_2\}) \supseteq \gamma(\{S_1\}) \cap \gamma(\{S_2\})$, with
equality if the cons operation is exact.

To summarize, the approximate meet operator consists of adding $cons(S_2)$ to $S_1$
temporarily, then performing Focus and Coerce operations to transfer the information
that is initially contained in the additional constraints to the universe and the
interpretation of $S_1$. Afterwards, the additional constraints are removed.

For instance, when we use the meet operation in Eqns. (6) and (8), we replace
$S'_1 \sqcap S'_2$ in Eqn. (5) by $Coerce(Focus(S'_1 \mathbin{\tilde{\sqcap}} S'_2, \{fpo(n)\}))$. This forces fpo to be given
a definite interpretation that is constrained by the set $cons(S_2)$, which represents the
summary transformer of the callee.

Converting a 3-Valued Structure to a Set of Constraints. To achieve this, we adapted 
a result from [18], which shows how to characterize a 3-valued logical structure that is 
in the image of canonical abstraction by means of a formula in first-order logic with 
transitive closure. The resulting formula can easily be converted to a set of constraints 
that satisfy the restricted syntax given in [1]. However, one of the constraints that would 
be generated according to [18] would be too expensive to check from an algorithmic 
point of view, so this constraint is dropped, which induces a safe overapproximation. 
(Roughly, this constraint captures the fact that any concrete structure represented by the 
abstract structure should contain a number of individuals greater than or equal to the 
number of individuals in the abstract structure.) 

5 Implementation and Experiments 

To perform interprocedural shape analysis by the method that is described in §4, we 
created a modified version of TVLA [3], an existing shape-analysis system, to allow it 
to support the following features: 

- We replaced the built-in notion of an intraprocedural CFG by the more general 
notion of equation system. 

- We designed a more general language in which to specify equation systems. 

- We implemented an approximation to the meet operation on 3-valued structures 
(and hence to the composition operation), as described in §4.3. 

Fig. 5 shows how the summary information we obtain captures the behavior of the
recursive list-reversal procedure of Figs. 1 and 2. The descriptor of the initial summary
transformer at start node $s_{main}$ was the 3-valued structure $S_0$, shown in Fig. 5(a), which
represents (the identity transformation on) all linked lists of length at least two that
are pointed to by program variable list. The head of the answer list is pointed to
by program variable res. At the program’s exit node $e_{main}$, the summary transformers
were the structures $S_1$ and $S_2$ of Fig. 5(b)-(c), which represent the transformations that
reverse lists of length two, and all lists of length greater than two, respectively.

Fig. 5. List-reversal example. (In each structure, unary predicates that have the same non-0 value
for all individuals are displayed in the box labeled “Const. unary predicates”. The values of the
“irrelevant” predicates of the vocabulary are not shown.)

As discussed in §3, to prevent the loss of essential information, several families of
instrumentation predicates were introduced:

- The unary predicates $id\_succ[n, m_1, m_2]$ and $id\_pred[n, m_1, m_2]$, where
  $m_1, m_2 \in \{in, out\}$ and $m_1 \neq m_2$, record information about the values of different modes
  of predicate n, in particular, whether the value of predicate $n[m_1]$ implies $n[m_2]$.
  These are defined by

      $id\_succ[n, m_1, m_2](v) = \forall v_1 : (n[m_1](v, v_1) \Rightarrow n[m_2](v, v_1))$
      $id\_pred[n, m_1, m_2](v) = \forall v_1 : (n[m_1](v_1, v) \Rightarrow n[m_2](v_1, v))$.

  The fact that $id\_succ[n, in, out](v) \wedge id\_succ[n, out, in](v) \wedge id\_pred[n, in, out](v) \wedge
  id\_pred[n, out, in](v)$ holds globally in $S_0$ (cf. Fig. 5(a)) captures the condition that
  the n[in] and n[out] predicates are identical at the entry node of the procedure. The
  n[in] predicates serve as an indelible record of the state of the n-links at the entry
  node.

- The unary predicates $reverse\_n\_succ[m_1, m_2]$, again with $m_1, m_2 \in \{in, out\}$ and
  $m_1 \neq m_2$, record whether $n[m_2]$ is an inverse of $n[m_1]$. These are defined by

      $reverse\_n\_succ[m_1, m_2](v) = \forall v_1 : (n[m_1](v, v_1) \Rightarrow n[m_2](v_1, v))$.

  The values for these predicates in $S_1$ and $S_2$ show that for each n-link $n[in](v_1, v_2)$
  at the entry node we have an n-link $n[out](v_2, v_1)$ at the exit node. In
  other words, the procedure has reversed all the n-links.
In addition, during the composition operation, some additional constraint rules were
needed for the system to be able to deduce a relationship between n[in] and n[out].
These are defined by

    $id\_succ[n, in, tmp](v) \wedge reverse\_n\_succ[tmp, out](v) \Rightarrow reverse\_n\_succ[in, out](v)$

    $reverse\_n\_succ[in, tmp](v) \wedge id\_pred[n, tmp, out](v) \Rightarrow reverse\_n\_succ[in, out](v)$

Notice that only the $reverse\_n\_succ[m_1, m_2]$ predicates and the related constraint rules
are particular to the list-reversal example. The other predicates that appear in Fig. 5 were
already used in previous papers on shape analysis of list-manipulation programs (see
[1]): for instance, r[n, out, list](v) holds the value 1 for individuals that are reachable
from variable list through a chain of n[out] links. From the above definitions of the
instrumentation predicates, it should be clear that the set of 3-valued structures $\{S_1, S_2\}$
accurately captures the fact that the output list is the reversal of the input list, and that
the result is a list of length at least two.
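
As an illustration only, the predicate definitions and the first constraint rule can be evaluated on a small concrete (2-valued) list. This is a minimal sketch, not the TVLA encoding: the representation of each mode of n as a set of (source, target) pairs, the cell names, and the function signatures are all assumptions made for the example.

```python
def id_succ(n_a, n_b, v, universe):
    """id_succ[n, a, b](v): every a-mode n-successor of v is also a b-mode n-successor."""
    return all((v, w) not in n_a or (v, w) in n_b for w in universe)


def id_pred(n_a, n_b, v, universe):
    """id_pred[n, a, b](v): every a-mode n-predecessor of v is also a b-mode n-predecessor."""
    return all((w, v) not in n_a or (w, v) in n_b for w in universe)


def reverse_n_succ(n_a, n_b, v, universe):
    """reverse_n_succ[a, b](v): each a-mode link v -> w is matched by a b-mode link w -> v."""
    return all((v, w) not in n_a or (w, v) in n_b for w in universe)


# A three-cell list c1 -> c2 -> c3 at entry; tmp is unchanged; out is the reversal.
universe = ["c1", "c2", "c3"]
n_in = {("c1", "c2"), ("c2", "c3")}
n_tmp = set(n_in)
n_out = {(w, v) for (v, w) in n_tmp}

reversed_everywhere = all(reverse_n_succ(n_in, n_out, v, universe) for v in universe)

# First constraint rule: id_succ[n,in,tmp] and reverse_n_succ[tmp,out] imply reverse_n_succ[in,out].
rule_holds = all(
    not (id_succ(n_in, n_tmp, v, universe) and reverse_n_succ(n_tmp, n_out, v, universe))
    or reverse_n_succ(n_in, n_out, v, universe)
    for v in universe
)
print(reversed_everywhere, rule_holds)
```

On concrete structures the rule is just a consequence of the definitions; in the analysis it matters because the 3-valued evaluation of the defining formulas may be less precise than the stored predicate values.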

Our second experiment involved comparing our results with [11] on the following
examples: (i) list reversal (as discussed above), and (ii) non-deterministic insertion and
deletion of a cell in a list. Results are shown in Fig. 6. Our method performs better than
that of [11] for the list-reversal program, but worse for the latter two programs. For
those, we considered programs where the cell to be inserted is passed as an input
parameter (in the insert example), and the deleted cell is received back as an output
parameter (in the delete example); this provides information about where the cell
has been inserted (resp. deleted). For the versions of the programs analyzed by the
method of [11], we added a global variable cell, which plays a similar role.

                 Our method                        Method of [11]
    Program      # of       Time    Space         # of       Time    Space
                 structs.   (sec)   (Mb)          structs.   (sec)   (Mb)
    reverse      7/3        11      26            •/3        37      17
    insert       23/9       188     43            ■19        70      18
    delete       32/13      222     43            ■ n        47      17
    tree exch.   22/10      92      33

Fig. 6. Experimental results. The experiments were performed on a PC equipped with
a 2 GHz Pentium 4 processor and 768 Mb of memory. Time and Space information were
obtained with the time and top commands. The two numbers in each entry of the columns
labeled # of structs. give the number of structures for the summary transformer of the
recursive procedure and the number of structures at the end of the main procedure,
respectively.

Concerning the slower computation times, we think they are mainly due to the
higher number of predicates to be manipulated (because of the different modes) and
the cost of the meet operation. However, it is important to keep in mind that our method
computes a summary transformer for each procedure, which [11] does not. The summary
transformer $\phi(e_q)$ at an exit node $e_q$ is a partial function: the domain of $\phi(e_q)$
overapproximates the set of reachable states at $s_{proc(e_q)}$ from which it is possible to
reach $e_q$; the range of $\phi(e_q)$ overapproximates the set of reachable states at $e_q$. This has
an impact on the results: for the delete example, the method of [11] is not able to keep
track of the original position in the list of the deleted cell, unlike our method. (For the
insert example, however, the two methods are similar w.r.t. this kind of information.)

Our third experiment was to analyze a procedure that recursively exchanges the
right and left subtrees of a binary tree. This example is interesting because it would
be difficult to implement this operation as a non-recursive procedure. The analysis was
able to establish that after the procedure finishes, the subtrees of all cells reachable from
the root have been exchanged, whereas the other cells have not been modified.

Statistics are given in Fig. 6. More information about the experiments is available
at http://www.irisa.fr/prive/bjeannet/interproctvla/interproctvla.html.

6 Related Work 

The analysis described in this paper uses 3-valued structures over a doubled vocabu- 
lary. A similar approach is standard when concrete transition relations are expressed by 
means of formulas. For instance, the semantics of a statement x := y+1 can be
expressed as $(x' = y + 1) \wedge (y' = y)$. Statements such as x := y+1 can be transformed
into composable abstract transformers for programs that manipulate numeric data, using
several numeric lattices (e.g., polyhedra [19], octagons [20], etc.). In contrast,
Observation 1 provides a way to create composable abstract transformers for the analysis of
programs that support both dynamically-allocated storage and destructive updating of
pointer-valued fields of structures. A key feature of the approach is that instrumentation
predicates can refer to both the $\mathcal{V}[in]$ and $\mathcal{V}[out]$ vocabularies. For instance, the family
of unary predicates $reverse\_n\_succ[m_1, m_2]$ discussed in §5 (with $m_1, m_2 \in \{in, out\}$
and $m_1 \neq m_2$) records whether $n[m_2]$ is an inverse of $n[m_1]$.
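
The doubled-vocabulary idea for numeric statements can be made concrete with a small sketch. It is illustrative only and works on explicit relations rather than abstract values: the finite value range, the helper names, and the second statement y := x are assumptions introduced for the example.

```python
from itertools import product

VALUES = range(4)  # small finite domain, an assumption made only for this sketch


def transformer(formula):
    """Set of (pre, post) state pairs satisfying a two-vocabulary formula.

    A state is a tuple (x, y); `pre` plays the role of the unprimed
    variables and `post` the role of the primed ones.
    """
    states = list(product(VALUES, repeat=2))
    return {(pre, post) for pre in states for post in states if formula(pre, post)}


# x := y + 1, i.e. (x' = y + 1) and (y' = y).
assign_x = transformer(lambda pre, post: post[0] == pre[1] + 1 and post[1] == pre[1])
# y := x, i.e. (y' = x) and (x' = x).
assign_y = transformer(lambda pre, post: post[1] == pre[0] and post[0] == pre[0])


def compose(first, second):
    """Relational composition: apply `first`, then `second`."""
    return {(a, c) for (a, b) in first for (b2, c) in second if b == b2}


both = compose(assign_x, assign_y)
print(sorted(both)[:3])
```

Composition matches the post-state of the first relation against the pre-state of the second, which is the role played by the meet over a shared intermediate vocabulary in the shape-analysis setting.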

As discussed in the introduction, interprocedural shape analysis was also studied in 
[11]. The approach used in the present paper was inspired by the functional approaches 
of [4-6]. In contrast, the approach used in [11] is more reminiscent of the “call-strings”
approach of [5]. 

A method for performing interprocedural shape analysis using procedure specifica- 
tions and assume-guarantee reasoning is presented in [16]. There it is assumed that 
a specification for each procedure - a pre- and post-condition - is already known; 
the technique presented in [16] can be used to interpret a procedure’s pre- and post- 
condition in the most precise way (for a given abstraction). For every procedure invoca- 
tion, one checks if the current abstract value potentially violates the precondition; if it 
does, a warning is produced. At the point immediately after the call, one can assume that 
the post-condition holds. Similarly, when a procedure is analyzed, the pre-condition is 
assumed to hold on entry, and at end of the procedure the post-condition is checked. The 
work described in the present paper is complementary to [16]: the work described here 
- particularly in the modified form sketched in footnote 5 - provides a way to identify
procedure specifications (in the form of sets of 2-vocabulary 3-valued structures) that 
can be used with the method from [16]. 

A second connection is that [16] provides a method to compute the most-precise 
overapproximation of the meet of two abstract values, which is the operation needed 
for composing transformers that are expressed as sets of 2-vocabulary 3-valued struc- 
tures (see Eqns. (6) and (8)). Consequently, [16] provides a more precise alternative to 
the approximate meet operation described in §4.3. (At present, implementations of the 
methods from [16] are based on theorem provers, and are much slower than the method 
from §4.3, which does not involve a theorem prover.)
Abstract. One of the continuing challenges in abstract interpretation is the creation
of abstractions that yield analyses that are both tractable and precise enough
to prove interesting properties about real-world programs. One source of diffi- 
culty is the need to handle programs with different behaviors along different ex- 
ecution paths. Disjunctive (powerset) abstractions capture such distinctions in a 
natural way. However, in general, powerset abstractions increase space and time 
costs by an exponential factor. Thus, powerset abstractions are generally per- 
ceived as very costly. 

In this paper, we partially address this challenge by presenting and empirically 
evaluating a new heap abstraction. The new heap abstraction works by merging 
shape descriptors according to a partial isomorphism similarity criteria, resulting 
in a partially disjunctive abstraction. 

We implemented this abstraction in TVLA - a generic system for implementing 
program analyses. We conducted an empirical evaluation of the new abstraction
and compared it with the powerset heap abstraction. The experiments show that 
analyses based on the partially disjunctive heap abstraction are as precise as the 
ones based on the powerset heap abstraction. In terms of performance, analyses 
based on the partially disjunctive heap abstraction are often superior to analyses 
based on the powerset heap abstraction. The empirical results show considerable
speedups, up to 2 orders of magnitude, enabling previously non-terminating
analyses, such as verification of the Deutsch-Schorr-Waite scanning algorithm,
to terminate with no negative effect on the overall precision. Indeed, experience
indicates that the partially disjunctive shape abstraction improves performance 
across all TVLA analyses uniformly, and in many cases is essential for making 
precise shape analysis feasible. 



1 Introduction 

One of the continuing challenges in abstract interpretation [3] is the creation of abstrac- 
tions that yield analyses that are both tractable and precise enough to prove interesting 
properties about real-world programs. In this paper we partially address this challenge 
by presenting and empirically evaluating a new heap abstraction, i.e., an abstraction for 
the (potentially unbounded) dynamically allocated storage manipulated by programs 
(e.g., see [7, 9, 2, 8, 16, 14, 15]). Heap abstractions are of fundamental importance to 
static analysis and verification of programs written in modern languages. Heap
abstractions have been used, for instance, in the context of shape analysis (e.g., for
proving that a program fragment preserves certain tree structure invariants), as well as in verifying
that a client program satisfies certain conformance constraints for the correct usage of 
a library. 

We present our abstraction in the context of the parametric abstract interpretation 
framework of [15], which is based on the idea of representing program states using 3- 
valued logical structures. While it is very natural to view the abstraction we present as 
a heap abstraction, it can be used for abstracting other domains as well. 

The TVLA framework presented in [15] uses a disjunctive (powerset) heap abstrac- 
tion: the abstract value at every program point is a set of shape descriptors (of bounded 
size) and set union is used as the join operation. In particular, this abstraction does not
attempt to combine (or merge) different shape descriptors into one and relies on the fact 
that there are only finitely many shape descriptors (as they are of bounded size). This 
leads to powerful and sophisticated analyses for proving interesting program properties
but is usually too expensive to be applied to real-world programs. (The number of
distinct shape descriptors is doubly exponential in the size of the program in the worst 
case.) 

The heap abstractions most commonly used in practice, especially when scalability
is important, tend to be single-shape heap abstractions, which use a single shape
descriptor to describe all possible program states at a program point [9, 2, 14]. The
current TVLA implementation provides options to utilize such single-shape heap
abstractions. However, our experience has been that for the kind of applications that we
have used TVLA for (mostly verification problems), the single-shape abstraction tends
to be imprecise and causes a number of “false alarms” (i.e., verification fails for
correct programs). Hence, this abstraction is not widely used by TVLA users. (A detailed
discussion of the single-shape abstractions is beyond the scope of this paper, because
of the complexity of formalizing the single-shape abstractions within the framework of
3-valued logic.)

This paper presents a partially disjunctive heap abstraction which, in our experi- 
ence, is significantly more efficient than the powerset heap abstraction, but has turned 
out to be precise enough for all the applications we have experimented with. Indeed, 
this abstraction has turned out to be the abstraction of choice for all TVLA users. The 
main idea behind this abstraction is to reduce the set of shape descriptors arising at a 
program point by merging “similar” shape descriptors but keeping “dissimilar” shape 
descriptors apart. 

1.1 Running Example 

Figure 1 shows a method implementing the mark phase of a mark-and-sweep garbage 
collector. The challenge here is to show that this procedure is partially correct, i.e., to 
establish that “upon termination, an element is marked if and only if it is reachable from 
the root.” This simple program serves as a running example in this paper. The partial 
correctness of this program was established using abstract interpretation in [13]. This 
abstract interpretation was created using TVLA - a generic system for implementing 
program analyses [10]. The default implementation of TVLA uses the powerset heap
abstraction. Verification of the above property using the powerset heap abstraction took
584 cpu seconds and generated 189,772 different shape descriptors - definitely too
many for such a simple program and simple property. The situation is worse for
verifying a similar property for an implementation of the Deutsch-Schorr-Waite scanning
procedure [11]. This verification took 4 hours when the powerset heap abstraction was
used.

    // @Ensures marked == REACH(root)
    void mark(Node root, NodeSet marked) {
      Node x;
      if (root != null) {
        NodeSet pending = new NodeSet();
        pending.add(root);
        marked.clear();
        while (!pending.isEmpty()) {
          x = pending.selectAndRemove();
          marked.add(x);
          if (x.left != null)
            if (!marked.contains(x.left))
              pending.add(x.left);
          if (x.right != null)
            if (!marked.contains(x.right))
              pending.add(x.right);
        }
      }
    }

Fig. 1. A simple Java-like implementation of the mark phase of a mark-and-sweep garbage
collector
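
For readers who want to experiment with the property being verified, here is a minimal executable sketch of the mark phase in Python, checked against an independently computed reachability set on one small heap. It is a test on a single concrete input, not a verification; the `Node` class, the `reach` helper, and the choice to return the marked set instead of filling a parameter are assumptions made for the example.

```python
class Node:
    """Heap cell with two reference fields, as in the running example."""

    def __init__(self, left=None, right=None):
        self.left, self.right = left, right


def mark(root):
    """Worklist marking that mirrors the loop structure of the mark procedure."""
    marked = set()
    if root is not None:
        pending = {root}
        while pending:
            x = pending.pop()  # plays the role of selectAndRemove()
            marked.add(x)
            for child in (x.left, x.right):
                if child is not None and child not in marked:
                    pending.add(child)
    return marked


def reach(root):
    """Reference implementation of REACH(root) by recursive traversal."""
    seen = set()

    def visit(node):
        if node is not None and node not in seen:
            seen.add(node)
            visit(node.left)
            visit(node.right)

    visit(root)
    return seen


# A heap with sharing, a cycle back to the root, and one garbage cell.
a = Node()
b = Node()
c = Node(left=a)
root = Node(left=b, right=c)
b.left = root
garbage = Node(left=root)
print(mark(root) == reach(root), garbage in mark(root))
```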

Powerset heap abstractions are costly since they may distinguish between too many 
shape descriptors, which may not be necessary in order to verify program properties. In 
this paper, we define a partially disjunctive heap abstraction, which is coarser than the 
powerset heap abstraction. The main idea is to reduce the set of shape descriptors arising 
at a program point by merging “similar” shape descriptors. In the mark example,
verification using the partially disjunctive heap abstraction took 3 cpu seconds and generated
1,133 shape descriptors - a two orders of magnitude improvement over verification
using the powerset heap abstraction - with the same precision. Similarly, the
verification of an implementation of the Deutsch-Schorr-Waite scanning procedure terminated
successfully in 158 cpu seconds using the partially disjunctive heap abstraction.

1.2 Main Results 

A New Abstraction. We define a new heap abstraction, which we refer to as the
partial-isomorphism heap abstraction. The new abstraction is coarser than the pow- 
erset heap abstraction and yet keeps certain shape descriptors apart. Our abstraction is 
parametric. It allows the user to specify which heap properties are of importance for a 
given analysis, and this guides the abstraction in determining which shape descriptors 
are merged together. 

Robust Implementation. We implemented our abstraction in TVLA. This abstraction 
has turned out to be the abstraction of choice for all TVLA users (e.g., see [19]). We
believe that it is simple enough to be implemented in other systems besides TVLA
(e.g., [17]).

Empirical Evaluation. We empirically evaluated our abstraction by comparing it with 
the powerset heap abstraction. In the largest benchmark, SQLExecuter, powerset 
heap abstraction did not terminate within 20,000 cpu seconds. In contrast, the new
abstraction took 9,673 cpu seconds and proved correct usage of JDBC objects and
absence of null-dereferences. 

1.3 Outline 

In Section 2, we give an overview of 3-valued-logic based program analysis. In Sec- 
tion 3, we describe the partial-isomorphism heap abstraction. In Section 4, we provide 
an empirical evaluation of the partial-isomorphism heap abstraction and powerset heap 
abstraction. In Section 5, we outline several other heap abstractions that we are investi- 
gating as ongoing work. In Section 6, we discuss related work. 

2 3-Valued Shape Analysis Primer

We now present an overview of first order transition systems (FOTS), the formalism 
underlying the parametric analysis framework of [15]. FOTS may be thought of as an 
imperative language built around an expression sub-language based on first-order logic 
with transitive closure. 

Concrete Program Configurations 

In FOTS, program states are represented using 2-valued logical structures. 

Definition 1. A 2-valued logical structure over a set of predicates P is a pair
$S^\natural = \langle U^\natural, I^\natural \rangle$ where:

- $U^\natural$ is the universe of the 2-valued structure.

- $I^\natural$ is the interpretation function mapping predicates to their truth-value in the
  structure: for every predicate $p \in P$ of arity $k$, $I^\natural(p) : (U^\natural)^k \to \{0, 1\}$.

In the context of shape analysis, a logical structure is used as a shape descriptor, with 
each individual corresponding to a heap-allocated object and predicates of the structure 
corresponding to properties of heap-allocated objects. 

In the following, we use $p^{S^\natural}(u)$ as alternative notation for $I^\natural(p)(u)$, omitting the
superscript $S^\natural$ when no confusion is likely. We denote the set of all 2-valued logical
structures over a set of predicates P by 2-STRUCTp. We will mostly assume that the 
set of predicates P is fixed and abbreviate 2-STRUCTp to 2-STRUCT. 

Table 1 shows the predicates used to record properties of individuals for the analysis
of our running example. A unary predicate ref(v) holds when the reference (or pointer)
variable ref points to the object v; in our example $ref \in \{x, root\}$. Similarly, a binary
predicate fld(v1, v2) records the value of a reference (or pointer-valued) field fld;
in our example $fld \in \{left, right\}$. A unary predicate set[s](v) holds when the
object v belongs to the set s; in our example $s \in \{marked, pending\}$.

Table 1. Predicates used to verify the running example

    Predicates        Intended Meaning
    x(v)              Does reference variable x point to object v?
    root(v)           Does reference variable root point to object v?
    left(v1, v2)      Does field left of object v1 point to object v2?
    right(v1, v2)     Does field right of object v1 point to object v2?
    r[root](v)        Is object v heap-reachable from reference variable root?
    set[marked](v)    Is object v a member of the marked set?
    set[pending](v)   Is object v a member of the pending set?
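
A 2-valued structure over the predicates of Table 1 can be sketched as plain data. The encoding below is an assumption made for illustration (it is not how TVLA stores structures): the interpretation maps each predicate name to the set of tuples on which it evaluates to 1, and r[root] is computed as reachability over left and right edges, including the object that root points to.

```python
def reachable(start_nodes, edges):
    """Objects reachable from start_nodes (inclusive) along directed edges."""
    seen, work = set(start_nodes), list(start_nodes)
    while work:
        u = work.pop()
        for (a, b) in edges:
            if a == u and b not in seen:
                seen.add(b)
                work.append(b)
    return seen


def make_structure(universe, root, x, left, right, marked, pending):
    """Return (U, I), where I[p] is the set of tuples on which predicate p is 1."""
    interp = {
        "x": {(u,) for u in universe if u == x},
        "root": {(u,) for u in universe if u == root},
        "left": set(left),
        "right": set(right),
        "set[marked]": {(u,) for u in marked},
        "set[pending]": {(u,) for u in pending},
    }
    start = [root] if root is not None else []
    interp["r[root]"] = {(u,) for u in reachable(start, set(left) | set(right))}
    return set(universe), interp


def holds(structure, pred, *args):
    """Truth value (0 or 1) of pred on the given individuals."""
    return 1 if tuple(args) in structure[1][pred] else 0


# Exit state of mark on a tiny heap: o4 is garbage, everything reachable is marked.
structure = make_structure(
    universe=["o1", "o2", "o3", "o4"], root="o1", x="o2",
    left={("o1", "o2")}, right={("o1", "o3")},
    marked={"o1", "o2", "o3"}, pending=set(),
)
print(holds(structure, "r[root]", "o3"), holds(structure, "r[root]", "o4"))
```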

In this paper, program configurations (i.e., 2-valued logical structures) are depicted
as directed graphs. Each individual of the universe is drawn as a node. A unary
predicate p(u), which holds for a node u, is drawn inside the node u. If a unary predicate
represents a reference variable, it is shown by having an arrow drawn from its name to
the node pointed to by the variable. A binary predicate $p(u_1, u_2)$ which evaluates to 1 is
drawn as a directed edge from $u_1$ to $u_2$ labelled with p.

Figure 2(a) shows a concrete configuration arising at the exit label of the mark pro- 
cedure, where all the individuals that are reachable from root are marked, as indicated 
by the value of the set[marked] predicate. The individuals represented by the empty 
nodes correspond to garbage objects. 

Operational Semantics 

In FOTS, program statements are modelled by actions that specify how statements 
transform an incoming logical structure into an outgoing logical structure. This is done 
primarily by defining the values of the predicates in the outgoing structure using first- 
order logical formulae with transitive closure over the incoming structure [15]. 

Abstract Program Configurations 

We now describe the abstractions used to create a finite (bounded) representation of 
a potentially unbounded set of 2-valued structures (representing heaps) of potentially 
unbounded size. The abstractions we use are based on 3-valued logic [15], which ex- 
tends boolean logic by introducing a third value 1/2, denoting values that may be 0 or 
1. In particular, we utilize the partially ordered set {0, 1, 1/2} with the join operation
$\sqcup$, defined by $x \sqcup y = x$ if $x = y$ and $1/2$ otherwise.
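
The join on truth values is small enough to state directly in code. In this sketch the value 1/2 is represented with `fractions.Fraction` and the helper `join_all` is a hypothetical convenience for joining several values at once.

```python
from fractions import Fraction
from functools import reduce

HALF = Fraction(1, 2)  # the indefinite value 1/2


def join(x, y):
    """Join of two truth values from {0, 1, 1/2}: equal values are kept, otherwise 1/2."""
    return x if x == y else HALF


def join_all(values):
    """Join of a non-empty sequence of truth values."""
    return reduce(join, values)


print(join(0, 1), join(1, 1), join_all([1, HALF, 1]))
```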

Definition 2. A 3-valued logical structure over a set of predicates P is a pair
$S = \langle U, I \rangle$ where:

- $U$ is the universe of the 3-valued structure.

- $I$ is the interpretation function mapping predicates to their truth-value in the
  structure: for every predicate $p \in P$ of arity $k$, $I(p) : U^k \to \{0, 1, 1/2\}$.

Fig. 2. (a) A concrete program configuration arising at the exit label of the mark procedure, where
all non-garbage nodes have been marked; (b) An abstract program configuration that
approximates the concrete configuration in (a)



A 3-valued logical structure can be used as an abstraction of a larger 2-valued logical 
structure. This is achieved by letting an abstract configuration (i.e., a 3-valued logical 
structure) include summary individuals, i.e., an individual which corresponds to one or 
more individuals in a concrete configuration represented by that abstract configuration. 
In the rest of the paper, we assume that the set of predicates P includes a distinguished 
unary predicate sm to indicate if an individual is a summary individual. 

In this paper, 3-valued logical structures are also depicted as directed graphs, where 
binary predicates with 1 /2 values are shown as dotted edges and summary individuals 
are shown as double-circled nodes. 

We denote the set of all 3-valued logical structures over a set of predicates P by
3-STRUCT$_P$, usually abbreviating it to 3-STRUCT. We define a preorder on structures,
denoted by $\sqsubseteq$, based on the concept of embedding.

Definition 3. Let $S$ and $S'$ be two structures and let $f : U^S \to U^{S'}$ be surjective. We
say that f embeds $S$ in $S'$ (denoted by $S \sqsubseteq^f S'$) if (i) for every predicate p (including
sm) of arity k, and every k-tuple of individuals $u_1, \ldots, u_k \in U^S$,

    $p^S(u_1, \ldots, u_k) \sqsubseteq p^{S'}(f(u_1), \ldots, f(u_k))$    (1)

and (ii) for all $u' \in U^{S'}$,

    $(|\{u \mid f(u) = u'\}| > 1) \sqsubseteq sm^{S'}(u')$    (2)

We say that $S$ can be embedded in $S'$ (denoted by $S \sqsubseteq S'$) if there exists a function
f such that $S \sqsubseteq^f S'$.
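
The two conditions of the embedding definition can be checked mechanically for a given candidate function. The sketch below is a minimal illustration under assumed encodings: a structure is a dictionary with a `universe` list and an `interp` table, entries missing from a table default to 0, and the information order on truth values is x below y iff x = y or y = 1/2.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def leq(x, y):
    """Information order on {0, 1, 1/2}: x is below y iff x = y or y = 1/2."""
    return x == y or y == HALF


def value(struct, pred, args):
    """Truth value of pred on args; entries absent from the table are 0."""
    return struct["interp"][pred].get(tuple(args), 0)


def embeds(f, s, s_prime, arities):
    """Check that the mapping f (a dict) embeds structure s in structure s_prime."""
    if {f[u] for u in s["universe"]} != set(s_prime["universe"]):
        return False  # f must be surjective
    for pred, k in arities.items():  # condition (i)
        for args in product(s["universe"], repeat=k):
            image = tuple(f[u] for u in args)
            if not leq(value(s, pred, args), value(s_prime, pred, image)):
                return False
    for u_prime in s_prime["universe"]:  # condition (ii)
        preimage = [u for u in s["universe"] if f[u] == u_prime]
        many = 1 if len(preimage) > 1 else 0
        if not leq(many, value(s_prime, "sm", (u_prime,))):
            return False
    return True


arities = {"root": 1, "n": 2, "sm": 1}
concrete = {"universe": ["a", "b", "c"],
            "interp": {"root": {("a",): 1}, "n": {("a", "b"): 1, ("b", "c"): 1}, "sm": {}}}
abstract = {"universe": ["r", "t"],
            "interp": {"root": {("r",): 1},
                       "n": {("r", "t"): HALF, ("t", "t"): HALF},
                       "sm": {("t",): HALF}}}
good = {"a": "r", "b": "t", "c": "t"}
bad = {"a": "t", "b": "r", "c": "t"}
print(embeds(good, concrete, abstract, arities), embeds(bad, concrete, abstract, arities))
```

Here the three-cell list embeds into the two-node abstract list only when the cell pointed to by root is mapped to the non-summary node.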
Bounded Program Configurations

Note that the size of a 3-valued structure is potentially unbounded and that 3-STRUCT 
is infinite. The abstractions studied in this paper rely on a fundamental abstraction func- 
tion for converting a potentially unbounded structure (either 2-valued or 3-valued) into 
a bounded 3-valued structure, which we define now. This abstraction function $\beta_{blur}[A]$
is parameterized by a special set of unary predicates A referred to as the abstraction 
predicates. 

Let A be a set of unary predicates. An individual $u_1$ in a structure $S_1$ is said to
be A-compatible to an individual $u_2$ in a structure $S_2$ iff for every predicate $p \in A$,
$p^{S_1}(u_1) \sqsubseteq p^{S_2}(u_2)$ or $p^{S_2}(u_2) \sqsubseteq p^{S_1}(u_1)$. (Recall that the partial order $\sqsubseteq$ on
{0, 1, 1/2} is defined by $x \sqsubseteq y$ iff $x = y$ or $y = 1/2$.)

A 3-valued structure is said to be A-bounded if no two different individuals in its
universe are A-compatible. A structure that is A-bounded can have at most $2^{|A|}$
individuals. We denote the set of all 3-valued A-bounded structures over a set of predicates P by
B-STRUCT$_{P,A}$, and, as usual, omit the subscripts when no confusion is likely.

The abstraction function $\beta_{blur}$ : 3-STRUCT $\to$ B-STRUCT, which converts a
(potentially unbounded) 3-valued structure into a bounded 3-valued structure, is defined
as follows: we obtain an A-bounded structure from a given structure by merging all
pairs of A-compatible individuals. $\beta_{blur}(\langle U_1, I \rangle) = \langle U_2, J \rangle$, where $U_2$ is the set of
A-compatible equivalence classes of $U_1$, and the interpretation J is defined by:

    $p^J(c_1, \ldots, c_k) = \bigsqcup_{u_i \in c_i} p^I(u_1, \ldots, u_k)$    for $p \neq sm$

    $sm^J(c) = 1/2$    if $|c| > 1$

    $sm^J(c) = sm^I(u)$    if $c = \{u\}$.

Figure 2(b) shows an A-bounded structure obtained from the structure in Figure 2(a) 
with A = {x, root, r[root], set[marked], set[pending]} . 
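
The merging step can be sketched for the common case in which the input is a 2-valued structure, so that A-compatibility reduces to agreeing on every abstraction predicate. The encoding is an assumption made for illustration: `interp[p]` is the set of tuples on which p is 1, each equivalence class is named by its tuple of abstraction-predicate values, and `blur` is a hypothetical name for the restricted operation.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def join(x, y):
    """Join on {0, 1, 1/2}."""
    return x if x == y else HALF


def blur(universe, interp, arities, abstraction_preds):
    """Bounded 3-valued structure obtained from a 2-valued one.

    Individuals that agree on all abstraction predicates are merged; each
    predicate value on merged nodes is the join over the member tuples.
    """
    def name(u):
        return tuple((p, 1 if (u,) in interp[p] else 0) for p in abstraction_preds)

    classes = {}
    for u in universe:
        classes.setdefault(name(u), []).append(u)

    new_interp = {}
    for pred, k in arities.items():
        table = {}
        for cs in product(classes, repeat=k):
            joined = None
            for us in product(*(classes[c] for c in cs)):
                v = 1 if us in interp[pred] else 0
                joined = v if joined is None else join(joined, v)
            table[cs] = joined
        new_interp[pred] = table
    # A 2-valued input has no summary individuals, so sm is 0 on singleton classes.
    new_interp["sm"] = {(c,): (HALF if len(m) > 1 else 0) for c, m in classes.items()}
    return set(classes), new_interp


# A four-cell list a -> b -> c -> d with x pointing to a, abstracted with A = {x}.
universe = ["a", "b", "c", "d"]
interp = {"x": {("a",)}, "n": {("a", "b"), ("b", "c"), ("c", "d")}}
U, I = blur(universe, interp, {"x": 1, "n": 2}, ["x"])
X, R = (("x", 1),), (("x", 0),)
print(len(U), I["n"][(X, R)], I["n"][(R, X)], I["sm"][(R,)])
```

The three cells not pointed to by x collapse into one summary node, and the n-edges into and within that node become 1/2.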

The abstraction function $\beta_{blur}$ serves as the basis for abstract interpretation in
TVLA. In particular, it serves as the basis for defining various different abstractions 
for the (potentially unbounded) set of 2-valued logical structures that arise at a program 
point. 

2.1 Powerset Heap Abstraction 

This abstraction is based on the fact that there can only be a finite number of bounded 
structures that are not isomorphic to one another. (Two structures are isomorphic when 
there is a bijection between their universes that preserves all predicate values.) The 
powerset abstraction function operates by bounding 2-valued structures with respect to 
a subset of the unary predicates, and removing duplicates (isomorphic structures). 

For the sake of simplicity we will work with canonic bounded structures. Note that
the individuals of an A-bounded structure are uniquely identified by the set of values of
the predicates in A; we refer to such a set of predicate values as the individual’s
canonical name. For example, the individual pointed to by root in Figure 2(b) has the
canonical name $\{\ldots, set[marked] = 1, set[pending] = 0\}$. A canonic bounded
structure is a bounded structure in which the individuals are identified by their canonical
names. We refer to the set of all canonic bounded structures by CB-STRUCT$_{P,A}$. Note
that for a given P and A, CB-STRUCT$_{P,A}$ is finite. The canonic abstraction function
$\beta_{canonic}$ : 2-STRUCT $\to$ CB-STRUCT is defined as follows: $\beta_{canonic}(S)$ is obtained
by renaming the individuals of $\beta_{blur}(S)$, giving them canonic names.

The powerset heap abstraction function $\alpha_{pow} : 2^{\text{2-STRUCT}} \to 2^{\text{CB-STRUCT}}$ is
defined by

    $\alpha_{pow}(XS) = \{ \beta_{canonic}(S) \mid S \in XS \}$.

3 The Partial-Isomorphism Heap Abstraction 

The idea behind partial-isomorphism heap abstraction is fairly simple. The powerset 
heap abstraction keeps all the canonic bounded structures that arise at a program point 
separate. Single-shape heap abstraction merges all canonic bounded structures arising 
at a program point into one structure. The partial-isomorphism heap abstraction, in 
contrast, merges canonic bounded structures into one structure only when they have the 
same universe. 

We say that a pair of canonic bounded structures are universe congruent iff the two
structures have the same universe. Universe congruence induces an equivalence relation
over sets of canonic bounded structures. This equivalence relation lets us define an
abstraction function $\alpha_{pi} : 2^{\text{2-STRUCT}_P} \to 2^{\text{CB-STRUCT}_{P,A}}$ by merging all universe congruent
structures. Given a set of canonic bounded structures XS with the same universe U, we
define the merged structure $\bigsqcup XS = \langle U, \iota \rangle$ that has the same universe as all structures
in XS and the following interpretation of predicates. For every predicate p of arity k and
tuple of individuals $(u_1, \ldots, u_k) \in U^k$,

$$p^{\bigsqcup XS}(u_1, \ldots, u_k) = \bigsqcup_{S \in XS} p^{S}(u_1, \ldots, u_k).$$

We are now ready to define the partial-isomorphism heap abstraction function $\alpha_{pi}$:

$$\alpha_{pi}(XS) = \{ \bigsqcup C \mid C \subseteq \alpha_{pow}(XS) \text{ is a universe congruence equivalence class} \}.$$
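The merge can be sketched in a few lines of Python. This is only an illustration, not TVLA's implementation: it assumes each canonic bounded structure is given as a pair (universe, interpretation), where the universe is a frozenset of canonical names and the interpretation maps (predicate name, tuple of individuals) to 0, 1, or 1/2, and it assumes the join of two different truth values is 1/2. The names `kleene_join`, `merge`, and `alpha_pi` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

HALF = Fraction(1, 2)  # the indefinite truth value 1/2


def kleene_join(a, b):
    """Information-order join: agreeing values are kept, disagreeing ones become 1/2."""
    return a if a == b else HALF


def merge(structures):
    """Merge a non-empty list of universe-congruent structures into one structure."""
    universe, first = structures[0]
    merged = dict(first)
    for other_universe, interp in structures[1:]:
        assert other_universe == universe, "structures must share a universe"
        for key, value in interp.items():
            merged[key] = kleene_join(merged[key], value)
    return universe, merged


def alpha_pi(canonic_structures):
    """Group structures by universe and merge each universe-congruence class."""
    classes = defaultdict(list)
    for universe, interp in canonic_structures:
        classes[universe].append((universe, interp))
    return [merge(group) for group in classes.values()]


# Hypothetical data: u and v stand in for canonical names.
u, v = "root-node", "other-node"
s1 = (frozenset({u, v}), {("marked", (u,)): 1, ("marked", (v,)): 1, ("succ", (u, v)): 1})
s2 = (frozenset({u, v}), {("marked", (u,)): 1, ("marked", (v,)): 1, ("succ", (u, v)): 0})
s3 = (frozenset({u}), {("marked", (u,)): 1})
result = alpha_pi([s1, s2, s3])
```

Here s1 and s2 share a universe and are merged into one structure whose binary predicate value becomes 1/2, while s3 has a different universe and is kept separate.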



Thus, partial-isomorphism heap abstraction is less precise than the powerset heap
abstraction.* As the empirical results presented later show, the partial-isomorphism
heap abstraction seems to work as well as (i.e., is as precise as) the powerset heap
abstraction in practice. The following propositions may help explain why.

Proposition 1. If a pair of bounded structures $S_1$ and $S_2$ are universe congruent, then
the merged structure $S_1 \sqcup S_2$ is the least bounded structure that approximates (embeds)
both $S_1$ and $S_2$.

When partial-isomorphism abstraction is applied to a pair of structures $S_1$ and $S_2$,
there are two possibilities:

- Structures $S_1$ and $S_2$ are not universe congruent. In this case, the result of the
abstraction is $\alpha_{pi}(\{S_1, S_2\}) = \{S_1, S_2\}$, which is the least upper bound of the
powerset abstraction, i.e., the most precise approximation of both structures.

- Structures $S_1$ and $S_2$ are universe congruent. In this case, the result of the
abstraction is $\alpha_{pi}(\{S_1, S_2\}) = S_1 \sqcup S_2$, which is the most precise upper bound among all
(singleton sets of) bounded structures.

* Here, precision is used in the sense of a Galois connection between a pair of abstract domains.

Proposition 2. Partial-isomorphism heap abstraction preserves the values of abstrac- 
tion predicates. 

In other words, partial-isomorphism heap abstraction only loses the same kind of
distinctions that can also be lost by $\beta_{blur}$: values of non-abstraction predicates.

In terms of worst-case complexity, partial-isomorphism heap abstraction has the 
same complexity as powerset heap abstraction - doubly-exponential in the number of 
abstraction predicates. This is due to the number of sets of canonical names, which is 
the dominant factor in the worst-case complexity. However, partial-isomorphism heap 
abstraction can save an exponential factor due to binary predicates, which is the domi- 
nant factor in many cases, in practice. 

3.1 Illustrating Example 

To illustrate the operation of partial-isomorphism heap abstraction, consider the abstract
program configuration shown in Figure 2(b) and the abstract program configuration
shown in Figure 3(a). Both configurations represent cases where all of the non-garbage
nodes have been marked and no garbage nodes have been marked, i.e., the program
property we want to verify holds for those configurations. The difference between the
configurations is in the position of the node pointed to by x in the part of the heap that
has been marked. In this case, the partial-isomorphism heap abstraction results in the
structure shown in Figure 3(b), which ignores the precise position of the node pointed to
by x inside the part of the heap that was marked.

The mark program non-deterministically selects an object and removes it from the
pending set. This non-determinism allows many different ways of traversing the set of
objects reachable from root, which results in many different abstract program
configurations that sustain the program property we want to verify and differ only in the
values of binary predicates. Partial-isomorphism heap abstraction ignores the values of the
binary predicates, but precisely keeps, for each abstract configuration, the sets of nodes
with the same garbage/non-garbage and marked/unmarked properties. This
allows the analysis to merge many similar structures without losing the information
needed to prove the partial correctness of the mark program.

4 Implementation and Empirical Evaluation 

We implemented the partial-isomorphism abstraction described in the previous section
in TVLA, and the implementation is publicly available [10]. We applied it to verify
various specifications for the Java programs described in Table 2. To translate Java
programs and their specifications to TVP (TVLA's input language), we used a front-end
for Java, which is based on the Soot framework [18]. For all benchmarks, we checked
the absence of null dereferences in addition to the properties described in Table 2. Our
specifications include correct usage of JDBC objects, correct usage of Java IO streams,
correct usage of Java collections and iterators, and additional small but interesting
specifications.

Fig. 3. (a) An abstract program configuration arising at the exit label of the mark procedure, where
all non-garbage nodes have been marked and x points to a node adjacent to root; (b) The result
of merging the structure in (a) and the structure in Figure 2(b)

Table 2. Benchmarks and properties used for comparing the analysis based on powerset heap
abstraction with the analysis based on partial-isomorphism heap abstraction. Treeness means
preservation of tree structure invariants

Benchmark       Description                       Property
GC.mark         Figure 1                          Partial correctness
DSW             Deutsch-Schorr-Waite              Partial correctness of tree scanning + Treeness
ISPath          Input streams                     Correct usage of Java IOStreams
InputStream5    Input stream holders              Correct usage of Java IOStreams
InputStream5b   Input stream holders with error   Correct usage of Java IOStreams
InputStream6    Input stream holders              Correct usage of Java IOStreams
SQLExecutor     A JDBC framework                  Correct usage of JDBC objects
KernelBench.1   CMP benchmark [12]                Absence of concurrent modification exceptions
InsertSorted    Insertion into sorted trees       Tree sortedness + Treeness
DeleteSorted    Deletion from sorted trees        Tree sortedness

The experiments were conducted using TVLA version 2, running with Sun's JRE
1.4, on a 1 GHz Intel Pentium processor machine with 1.5 GB RAM. We optimized for
precision and simplicity by using TVLA's Focus and Coerce operations in all benchmarks.
We compared partial isomorphism to the full powerset abstraction in terms of
time and space performance and precision.

The results of the analyses are shown in Table 3. In all the benchmarks the analysis
based on the partial-isomorphism heap abstraction achieved the same precision as the
analysis based on the powerset heap abstraction, and other TVLA users reported the
same phenomenon. In all but one example, the analysis based on partial-isomorphism
heap abstraction achieved significant performance improvements.

Table 3. Time, space and number of errors measurements. Rep. Err. is the number of errors
reported by the analysis, and Act. Err. is the number of errors that indicate real problems. Time and
space measurements for non-terminating benchmarks are prefixed with > to indicate the
measurements taken when the analysis timed out. The number of reported errors is the same for both
the analysis based on the powerset heap abstraction and the analysis based on partial-isomorphism
heap abstraction on all (terminating) benchmarks. For benchmarks that did not terminate with the
powerset heap abstraction, the numbers are taken from the analysis based on partial-isomorphism
heap abstraction

                Time in seconds           Space in Mb.
Benchmark       Powerset   Partial iso.   Powerset   Partial iso.   Rep. Err. / Act. Err.
GC.mark         584        3              56         1.4            0/0
DSW             14,364     157            116.3      5.6            0/0
ISPath          79         79             2.8        2.9            0/0
InputStream5    4,530      1,706          14.0       11.9           1/0
InputStream5b   3,492      1,394          9.8        9.1            1/0
InputStream6    15,558     3,929          23.6       15.9           1/0
SQLExecutor     >20,000    9,673          >109.3     104.8          0/0
KernelBench.1   7,393      5,355          13.3       10.8           1/1
InsertSorted    264        37             4.5        2.4            0/0
DeleteSorted    >20,000    3,271          >62.6      21.8           0/0
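To make the size of the improvement concrete, the time columns of Table 3 can be turned into speedup factors. The snippet below is a small illustrative calculation, not part of the original evaluation; the benchmark names follow the table as reconstructed here, and for the two benchmarks where the powerset analysis timed out the ratio is only a lower bound.

```python
# Analysis times in seconds from Table 3: (powerset, partial isomorphism, powerset timed out?).
TIMES = {
    "GC.mark":       (584, 3, False),
    "DSW":           (14364, 157, False),
    "ISPath":        (79, 79, False),
    "InputStream5":  (4530, 1706, False),
    "InputStream5b": (3492, 1394, False),
    "InputStream6":  (15558, 3929, False),
    "SQLExecutor":   (20000, 9673, True),
    "KernelBench.1": (7393, 5355, False),
    "InsertSorted":  (264, 37, False),
    "DeleteSorted":  (20000, 3271, True),
}


def speedups(times):
    """Return {benchmark: (powerset time / partial-iso time, is_lower_bound)}."""
    return {
        name: (powerset / partial_iso, timed_out)
        for name, (powerset, partial_iso, timed_out) in times.items()
    }


table = speedups(TIMES)
for name, (ratio, lower_bound) in table.items():
    prefix = ">" if lower_bound else ""
    print(f"{name:14s} {prefix}{ratio:.1f}x")
```

ISPath is the one benchmark with no improvement (ratio 1.0), which matches the "all but one example" remark above.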

4.1 Implementation Independent Results 

Although the results shown in Table 3 measure the time and space consumption of anal- 
yses using different abstractions, they are also influenced by the various implementation 
details of the abstractions. 

In Table 4, we supply implementation independent measurements. We measured the 
total number of abstract configurations generated by the analysis and the maximal num- 
ber of abstract configurations that exist in the transition system at any given time during 
the analysis. The total number of abstract configurations and the maximal number of 
abstract configurations are always the same with the powerset heap abstraction, since 
structures are only accumulated in the transition system. For the partial-isomorphism 
heap abstraction, the maximal number of abstract configurations is often lower than the 
total number of abstract configurations, indicating that structures discovered in different 
iterations were merged together. 

The results show a consistency between the improvements in time and space per- 
formance of the partial-isomorphism heap abstraction, relative to the powerset heap 
abstraction, and the reduced number of abstract configurations. 

5 Extensions and Future Work 

The partial-isomorphism heap abstraction has so far performed quite satisfactorily in
our experience with TVLA. However, we cannot assume that this will always be adequate.
Analysis and verification of larger programs may require more aggressive abstractions,
while in some cases we may require more precise abstractions. In this section
we describe various other abstractions that may be of value. We are currently in the
process of evaluating the effectiveness of some of the abstractions described below.

Table 4. Implementation independent measurements. Total #structs is the total number of abstract
configurations that arose during the analysis, and Max #structs is the maximal number of abstract
configurations that existed in the transition system at any time during the analysis. The results of
non-terminating benchmarks are prefixed with > to indicate the measurements taken when the
analysis timed out

                Total #structs            Max #structs
Benchmark       Powerset   Partial iso.   Powerset   Partial iso.
GC.mark         189,772    1,133          189,772    748
DSW             320,387    6,480          320,387    2,986
ISPath          2,168      2,168          2,168      2,168
InputStream5    8,164      3,366          8,164      2,204
InputStream5b   5,973      2,598          5,973      1,729
InputStream6    24,461     6,678          24,461     4,411
SQLExecutor     >8,824     4,107          >8,824     2,164
KernelBench.1   12,594     9,296          12,594     5,748
InsertSorted    7,487      1,318          7,487      905
DeleteSorted    >158,780   30,386         >158,780   25,673

Parametric Partial Isomorphism 

We now present a parametric abstraction that includes both the powerset heap abstrac- 
tion and the partial-isomorphism heap abstraction as special cases. 

Definition 4. We say that a pair of bounded structures $S_1 = \langle U_1, \iota_1 \rangle$ and
$S_2 = \langle U_2, \iota_2 \rangle$ are partially isomorphic with respect to a set of predicates R, denoted by
$S_1 \cong_R S_2$, iff there exists a bijection $f : U_1 \to U_2$ such that, for every predicate
$p \in R$ of arity k and tuple of nodes $(u_1, \ldots, u_k) \in U_1^k$, the following holds:

$$p^{S_1}(u_1, \ldots, u_k) = p^{S_2}(f(u_1), \ldots, f(u_k)).$$

Note that $\cong_R$ is an equivalence relation among 3-valued structures. Given any set
of predicates R that includes the set of all abstraction predicates A, we define an
abstraction function $\alpha_{pi[R]} : 2^{\text{2-STRUCT}_P} \to 2^{\text{CB-STRUCT}_{P,A}}$ as follows:

$$\alpha_{pi[R]}(XS) = \{ \bigsqcup C \mid C \subseteq \alpha_{pow}(XS) \text{ is a } \cong_R \text{ equivalence class} \}.$$

This function defines a whole family of abstractions. Further, $\alpha_{pow} = \alpha_{pi[P]}$ (where P
is the set of all predicates) is the most precise among this family of abstractions, and
$\alpha_{pi} = \alpha_{pi[A]}$ is the least precise among this family of abstractions.
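The parametric family can be sketched as follows. This is a hedged illustration rather than TVLA's code: it assumes structures are pairs (universe, interpretation) over canonical names, as in a canonic bounded structure, so that for R containing A the required bijection is the identity on names; two structures then fall into the same class exactly when they share a universe and agree on all predicates in R. The names `restrict` and `alpha_pi_param` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

HALF = Fraction(1, 2)  # the indefinite truth value 1/2


def restrict(interp, preds):
    """Hashable view of the interpretation restricted to the predicates in preds."""
    return frozenset((key, value) for key, value in interp.items() if key[0] in preds)


def alpha_pi_param(canonic_structures, R):
    """Merge structures that share a universe and agree on every predicate in R."""
    classes = defaultdict(list)
    for universe, interp in canonic_structures:
        classes[(universe, restrict(interp, R))].append(interp)
    result = []
    for (universe, _), interps in classes.items():
        merged = dict(interps[0])
        for interp in interps[1:]:
            for key, value in interp.items():
                if merged[key] != value:
                    merged[key] = HALF  # disagreeing values join to 1/2
        result.append((universe, merged))
    return result


# Hypothetical data: two structures that differ only in a binary predicate.
u, v = "root-node", "other-node"
s1 = (frozenset({u, v}), {("marked", (u,)): 1, ("marked", (v,)): 1, ("succ", (u, v)): 1})
s2 = (frozenset({u, v}), {("marked", (u,)): 1, ("marked", (v,)): 1, ("succ", (u, v)): 0})
coarse = alpha_pi_param([s1, s2], {"marked"})
fine = alpha_pi_param([s1, s2], {"marked", "succ"})
```

With R containing only the unary predicate the two structures are merged; adding the binary predicate to R keeps them apart, which is the powerset end of the family.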

The reason we restrict ourselves to sets R that contain the set of all abstraction
predicates A is the following. If R includes A, then for any two $\cong_R$-equivalent bounded
structures, the bijection between the universes of the two structures that preserves the
values of predicates in R is uniquely determined, and this bijection is used to determine
which individuals should be "merged" together.

This parametric definition allows users to choose abstractions in a more fine-grained 
fashion, by specifying the set of predicates R. The parametric abstraction could also be 
used by an appropriate iterative refinement technique, which starts with R = A and 
iteratively adds predicates to R, until a sufficiently precise abstraction is obtained or 
R = P. 



Deflating Reductions 

Deflating reductions can potentially yield performance improvements without a loss
of precision. A very simple deflating reduction is the following: consider a set of
3-valued structures X containing structures $S_1$ and $S_2$, such that $S_1 \sqsubseteq S_2$. Clearly, the
set $X' = X \setminus \{S_1\}$ is semantically equivalent to X, and removing $S_1$ involves no
loss of precision (even when the abstract transformer that is used is not the best). This
reduction is referred to as "non-redundancy" in [1]. Making this reduction feasible
requires testing for the partial order relation over 3-valued structures, which can be done
in polynomial time for bounded 3-valued structures. The key question with this
reduction is whether the subsequent (performance) benefits of doing the reduction outweigh
the extra cost of performing the reduction. Our initial experience shows that this
reduction is worth using. This reduction transforms TVLA's preorder over sets of 3-valued
structures into a proper (Hoare powerdomain) partial ordering.
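A minimal sketch of this reduction is given below. It makes a simplifying assumption that is not stated in the text: structures are compared only when they have the same universe of canonical names, and $S_1 \sqsubseteq S_2$ is then checked pointwise (each value in $S_2$ is either equal to the value in $S_1$ or is 1/2). The functions `approximated_by` and `deflate` are hypothetical names.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the indefinite truth value 1/2


def approximated_by(s1, s2):
    """Pointwise check of s1 below s2 for structures over the same universe."""
    (u1, i1), (u2, i2) = s1, s2
    if u1 != u2 or i1.keys() != i2.keys():
        return False
    return all(i2[key] == value or i2[key] == HALF for key, value in i1.items())


def deflate(structures):
    """Drop every structure approximated by another one; keep one copy of duplicates."""
    kept = []
    for i, s in enumerate(structures):
        redundant = any(
            j != i
            and approximated_by(s, t)
            and (not approximated_by(t, s) or j < i)
            for j, t in enumerate(structures)
        )
        if not redundant:
            kept.append(s)
    return kept


# Hypothetical data: s1 is more precise than s2, so s1 is redundant next to s2.
u, v = "root-node", "other-node"
s1 = (frozenset({u, v}), {("succ", (u, v)): 1})
s2 = (frozenset({u, v}), {("succ", (u, v)): HALF})
s3 = (frozenset({u, v}), {("succ", (u, v)): 0})
```

The quadratic loop mirrors the cost question raised above: every pair of structures may need to be compared, so the reduction pays off only if later analysis steps save more than this.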



6 Related Work 

A substantial body of literature exists on abstractions for various different domains and 
for creating new abstractions from existing abstractions. The distinguishing aspect of 
our work is its focus on heap abstractions and its focus on an empirical evaluation of 
the effectiveness of the proposed heap abstraction. 



Function Space Domain Construction. Function space domain construction is one 
way of creating abstractions that are “partly disjunctive”. Examples of previous work 
using such a domain construction include [5], where the abstraction is composed of two 
components - a lattice of symbolic access paths and a parametric numerical lattice. In 
this abstraction, abstract elements with the same symbolic access path component are 
merged by joining the numerical lattice component. The ESP system [4] also utilizes a 
similar function space domain construction, but not for heap abstractions. 



Least Disjunctive Basis. In [6], a technique is defined for obtaining the “least disjunc- 
tive basis”, which is the most abstract domain inducing the same disjunctive completion 
as another domain. Unfortunately, this may result in larger sets of abstract elements, as 
abstract elements are substituted by sets of other abstract elements, causing inflation.

Deflating Operators and Widening Operators. In [1], different widening operators
and congruence relations are considered for the powerset polyhedra domain, and in
more general settings.

Abstract. A method for generating polynomial invariants of imperative
programs is presented using the abstract interpretation framework.
It is shown that for programs with polynomial assignments, an invariant
consisting of a conjunction of polynomial equalities can be automatically
generated for each program point. The proposed approach takes into
account tests in conditional statements as well as in loops, insofar as they
can be abstracted to be polynomial equalities and disequalities. The
semantics of each statement is given as a transformation on polynomial
ideals. Merging of paths in a program is defined as the intersection of
the polynomial ideals associated with each path. For a loop junction, a
widening operator based on selecting polynomials up to a certain degree
is proposed. The algorithm for finding invariants using this widening
operator is shown to terminate in finitely many steps. The proposed
approach has been implemented and successfully tried on many programs.
A table providing details about the programs is given.

1 Introduction 

There has recently been a surge of interest in research on automatic generation 
of loop invariants of imperative programs. This is perhaps due to the successful 
development of powerful automated reasoning tools including BDD packages, 
SAT solvers, model checkers, decision procedures for common data structures 
in applications (such as numbers, lists, arrays, ...), as well as theorem provers 
for first-order logic, higher-order logic and induction. These tools have been 
successfully used in application domains such as hardware circuits and designs, 
software and protocol analysis. 

A method for generating polynomial invariants for imperative programs is
developed in this paper. It is analogous to the approach proposed in [6] for
finding linear inequalities as invariants based on the abstract interpretation
framework [5]. The proposed method, in contrast, generates polynomial equations as
invariants by interpreting the semantics of programming language constructs
in terms of ideal-theoretic operations, which we consider in itself an
exciting novel contribution of this paper. The semantics of each statement is given
as a transformation on polynomial ideals. It is shown that for programs with
polynomial assignments, an invariant consisting of a conjunction of polynomial
equalities can be automatically generated for each program point.

The proposed approach is able to handle nested loops¹ and also takes into
account tests in conditional statements and loops, insofar as they can be
abstracted to be polynomial equalities and disequalities. Merging of paths in a
program is defined as the intersection of the polynomial ideals associated with
each path. For ensuring the termination of the invariant generation procedure,
a widening operator is proposed. This widening operator is based on retaining
only the polynomials of degree ≤ d in the intersection; this is achieved by
computing a Gröbner basis [7] with a graded term ordering and keeping only those
polynomials in the basis with degree ≤ d. The procedure for finding invariants
using this widening operator is shown to terminate in finitely many steps.
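The degree-truncation step of the widening can be sketched as follows. This is an illustrative fragment only: it assumes a Gröbner basis with respect to a graded term ordering has already been computed by an external tool, that the degree bound d is inclusive, and that each nonzero polynomial is encoded as a dictionary from exponent tuples to coefficients. The names `degree` and `truncate_basis` are hypothetical.

```python
def degree(poly):
    """Total degree of a nonzero polynomial given as {exponent tuple: coefficient}."""
    return max(sum(exponents) for exponents, coeff in poly.items() if coeff != 0)


def truncate_basis(basis, d):
    """Keep only the basis polynomials whose total degree is at most d."""
    return [poly for poly in basis if degree(poly) <= d]


# Hypothetical basis in three variables (x1, x2, x3):
#   x1 - x2^2   (degree 2)   and   x1^3 - x3   (degree 3)
basis = [
    {(1, 0, 0): 1, (0, 2, 0): -1},
    {(3, 0, 0): 1, (0, 0, 1): -1},
]
kept = truncate_basis(basis, 2)
```

The Gröbner basis computation itself is not shown; only the filtering by degree is illustrated.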

The proposed algorithm has been implemented using Macaulay2 [12], an
algebraic geometry tool that supports operations on polynomial ideals such as
the computation of Gröbner bases. Using this implementation, loop invariants
for several numerical programs have been successfully generated automatically.

The method, as well as the implementation, does not need pre/postconditions
for deriving loop invariants. Further, under conditions on the semantics of
programs, it finds all polynomial invariants of degree ≤ d, where d is the degree
bound in the widening. In that sense, the method is sound and complete.

The rest of the paper is organized as follows. In the next subsection, related
work is briefly reviewed. Section 2 gives background information on polynomial
ideals, operations on them and special bases of polynomial ideals called
Gröbner bases. Section 3 introduces a simple programming language used in the
paper for presenting the method. Section 4 discusses abstraction and concretization
functions from variable values to ideals and vice versa, so that the framework
of abstract interpretation is applicable. Section 5 gives the semantics of
programming constructs using ideal-theoretic operations. For each kind of statement, it is
shown how the output polynomial ideal can be constructed from the input
polynomial ideals. Most importantly, Subsection 5.5 discusses the semantics of loop
junction nodes using a widening operator. Section 6 shows that, under specific
conditions of the semantics, the proposed method is sound and complete in the
sense that, for every program point, our algorithm indeed finds all the invariants
of degree ≤ d, where d is the parameter used in the widening operator. Section 7
illustrates the application of the method on some examples; this is followed by a
table giving details of programs successfully tried for which our implementation
discovers loop invariants. Section 8 concludes and discusses ideas for extending
this research.



¹ The method also works for unnested loops with spaghetti control flow, using
Bourdoncle's algorithm [1] to find adequate widening points in the control-flow graph.

1.1 Related Work

As stated above, the proposed approach is a complement of the method proposed 
by Cousot and Halbwachs [6], who applied the framework of abstract interpre- 
tation [5] for finding invariant linear inequalities. That work extended Karr’s 
algorithm in [16] for finding invariant linear equalities at any program point. 

Recently, there has been a renewed surge of interest in automatically deriving
invariants of imperative programs. In [4] Colón et al. have used non-linear
constraint solving based on Farkas' lemma to attack the problem of finding invariant
linear inequalities. Extending Karr's work, for programs with affine assignments
Müller-Olm and Seidl [18] proposed an interprocedural method for computing
polynomial equations of bounded degree as invariants. In [21], we developed
an abstract framework for generating invariants of loops. This framework was
instantiated to generate conjunctions of polynomial equations as invariants for
loop programs. The method used the Gröbner basis algorithm for computing such
invariants, and was shown to be sound and complete. However, that method
cannot handle nested loops; furthermore, tests in conditional statements and loops
are abstracted to be true. In [22], a method is proposed for generating nonlinear
polynomials as invariants, which starts with a template polynomial with
undetermined coefficients and attempts to find values for the coefficients so that the
template is invariant using the Gröbner basis algorithm.

In contrast, not only can the method proposed in this paper generate invari- 
ants of programs with nested loops, but also it is not necessary to know a priori 
the structure of the polynomials appearing as invariants. However, the widening 
operator does need as input the degree of the polynomial invariants of interest to 
the user. Furthermore, unlike the methods of [22], the proposed method has been 
implemented and tried on many examples with considerable success. We believe 
that the technique discussed in [6] for linear inequalities can be easily integrated 
with our method since they share the framework, thus resulting in a powerful 
effective method for automatically generating loop invariants expressible using 
linear inequalities and polynomial equalities. 

2 Preliminaries 

Given a field $\mathbb{K}$, let $\mathbb{K}[x] = \mathbb{K}[x_1, \ldots, x_n]$ denote the ring of polynomials in the
variables $x_1, \ldots, x_n$ with coefficients from $\mathbb{K}$. An ideal is a set $I \subseteq \mathbb{K}[x]$ which
contains 0, is closed under addition and such that if $p \in \mathbb{K}[x]$ and $q \in I$, then
$pq \in I$. Given a set of polynomials $S \subseteq \mathbb{K}[x]$, the ideal spanned by S is

$$\{ f \in \mathbb{K}[x] \mid \exists k \geq 1 : f = \sum_{j=1}^{k} p_j q_j \text{ with } p_j \in \mathbb{K}[x],\ q_j \in S \}.$$

This is the minimal ideal containing S and we denote it by $\langle S \rangle_{\mathbb{K}[x]}$ or simply by $\langle S \rangle$.
For an ideal $I \subseteq \mathbb{K}[x]$, a set $S \subseteq \mathbb{K}[x]$ such that $I = \langle S \rangle$ is called a basis of I and
we say that S generates I.

Given two ideals $I, J \subseteq \mathbb{K}[x]$, their intersection $I \cap J$ is an ideal. However,
the union of ideals is, in general, not an ideal. The sum of I and J,
$I + J = \{ p + q \mid p \in I, q \in J \}$, is the minimal ideal that contains $I \cup J$. The quotient of
I into J is the ideal $I : J = \{ p \mid \forall q \in J,\ pq \in I \}$.

For any set S of polynomials in $\mathbb{K}[x]$, the variety of S over $\mathbb{K}^n$ is defined as
its set of zeroes, $\mathbf{V}(S) = \{ \omega \in \mathbb{K}^n \mid p(\omega) = 0\ \forall p \in S \}$. When taking varieties
we can assume S to be an ideal, since $\mathbf{V}(\langle S \rangle) = \mathbf{V}(S)$. On the other hand, if
$A \subseteq \mathbb{K}^n$ the ideal $\mathbf{I}(A) = \{ p \in \mathbb{K}[x] \mid p(\omega) = 0\ \forall \omega \in A \}$ is called the ideal of A.
We write $\mathbf{IV}(S)$ instead of $\mathbf{I}(\mathbf{V}(S))$ and $\mathbf{VI}(A)$ instead of $\mathbf{V}(\mathbf{I}(A))$.

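Ideals and varieties are dual concepts, in the sense that given ideals I, J,
$\mathbf{V}(I \cap J) = \mathbf{V}(I) \cup \mathbf{V}(J)$ and $\mathbf{V}(I + J) = \mathbf{V}(I) \cap \mathbf{V}(J)$. Moreover, if $I \subseteq J$ then
$\mathbf{V}(I) \supseteq \mathbf{V}(J)$. Analogously, if $A, B \subseteq \mathbb{K}^n$ (in particular, if A, B are varieties),
then $\mathbf{I}(A \cup B) = \mathbf{I}(A) \cap \mathbf{I}(B)$ and $A \subseteq B$ implies $\mathbf{I}(A) \supseteq \mathbf{I}(B)$. However, in
general for any two varieties V, W the inclusion $\mathbf{I}(V \cap W) \supseteq \mathbf{I}(V) + \mathbf{I}(W)$ holds
and may be strict; but $\mathbf{I}(V \cap W) = \mathbf{IV}(\mathbf{I}(V) + \mathbf{I}(W))$ is always true.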
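A standard example, not taken from the paper, shows how the inclusion can be strict. It assumes $\mathbb{K} = \mathbb{R}$ and two variables, with V a parabola and W a line tangent to it:

```latex
\[
V = \mathbf{V}(x_2 - x_1^2), \qquad W = \mathbf{V}(x_2), \qquad V \cap W = \{(0,0)\}.
\]
\[
\mathbf{I}(V) + \mathbf{I}(W) = \langle x_2 - x_1^2,\ x_2 \rangle = \langle x_1^2,\ x_2 \rangle,
\qquad
\mathbf{I}(V \cap W) = \langle x_1,\ x_2 \rangle .
\]
\[
x_1 \in \mathbf{I}(V \cap W) \ \text{but}\ x_1 \notin \langle x_1^2,\ x_2 \rangle,
\qquad
\mathbf{IV}(\langle x_1^2,\ x_2 \rangle) = \langle x_1,\ x_2 \rangle = \mathbf{I}(V \cap W).
\]
```

The sum of the two ideals records the tangency through $x_1^2$, whereas the ideal of the intersection only sees the single point, so applying $\mathbf{IV}$ is what restores equality.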

For any ideal the inclusion $I \subseteq \mathbf{IV}(I)$ holds; $\mathbf{IV}(I)$ represents the largest set
of polynomials with the same set of zeroes as I. Since any I satisfying $I = \mathbf{IV}(I)$
is the ideal of the variety $\mathbf{V}(I)$, we say that any such I is an ideal of variety. For
any $A \subseteq \mathbb{K}^n$, it can be seen that the ideal $\mathbf{I}(A)$ is an ideal of variety. Moreover, if
I is an ideal of variety, then $\mathbf{I}(\mathbf{V}(I) - \mathbf{V}(J)) = I : J$, where $-$ denotes difference
of sets. For further detail on these concepts, see [8,7].

A term in a tuple $x = (x_1, \ldots, x_n)$ of variables is an expression of the form
$x^{\alpha} = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$, where $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$. The set of terms is denoted
by T. A monomial is an expression of the form $c \cdot p$, with $c \in \mathbb{K}$ and $p \in T$. The
degree of a monomial $c \cdot x^{\alpha}$ with $c \neq 0$ is $\deg(c \cdot x^{\alpha}) = \alpha_1 + \cdots + \alpha_n$. The degree
of a non-null polynomial is the maximum of the degrees of its monomials. We
denote the set of all polynomials of $\mathbb{K}[x]$ of degree ≤ d by $\mathbb{K}_d[x]$.

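An admissible term ordering is a relation $\succ$ over T such that:

1. $\succ$ is a total ordering over T.

2. If $\alpha, \beta, \gamma \in \mathbb{N}^n$ and $x^{\alpha} \succ x^{\beta}$, then $x^{\alpha+\gamma} \succ x^{\beta+\gamma}$.

3. $\forall \alpha \in \mathbb{N}^n$, $x^{\alpha} \succeq 1 = x^{0}$.

Moreover, $\succ$ is called a graded term ordering if $\forall \alpha, \beta \in \mathbb{N}^n$, $\deg(x^{\alpha}) > \deg(x^{\beta})$
implies $x^{\alpha} \succ x^{\beta}$.

Term orderings extend to monomials by ignoring the coefficients and
comparing the corresponding terms. The most common term orderings are defined
as follows, assuming that $x_1 \succ x_2 \succ \cdots \succ x_n$:

- Lexicographical Ordering (lex). If $\alpha, \beta \in \mathbb{N}^n$, then $x^{\alpha} \succ_{lex} x^{\beta}$ iff the
leftmost nonzero entry in $\alpha - \beta$ is positive.

- Graded Lexicographical Ordering (grlex). If $\alpha, \beta \in \mathbb{N}^n$, then $x^{\alpha} \succ_{grlex} x^{\beta}$
iff $\deg(x^{\alpha}) > \deg(x^{\beta})$, or $\deg(x^{\alpha}) = \deg(x^{\beta})$ and $x^{\alpha} \succ_{lex} x^{\beta}$.

- Graded Reverse Lexicographical Ordering (grevlex). If $\alpha, \beta \in \mathbb{N}^n$,
then $x^{\alpha} \succ_{grevlex} x^{\beta}$ iff $\deg(x^{\alpha}) > \deg(x^{\beta})$, or $\deg(x^{\alpha}) = \deg(x^{\beta})$ and the
rightmost nonzero entry in $\alpha - \beta$ is negative.

Each of these orderings is a total ordering on terms. The orderings grlex and
grevlex are examples of graded term orderings.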
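The three orderings translate directly into comparisons on exponent vectors. The following sketch represents a term by its exponent tuple $\alpha$ and returns whether the first term is strictly greater than the second; the function names are hypothetical.

```python
def lex_greater(a, b):
    """x^a > x^b in lex: the leftmost nonzero entry of a - b is positive."""
    for ai, bi in zip(a, b):
        if ai != bi:
            return ai > bi
    return False


def grlex_greater(a, b):
    """x^a > x^b in grlex: compare total degree first, then break ties with lex."""
    if sum(a) != sum(b):
        return sum(a) > sum(b)
    return lex_greater(a, b)


def grevlex_greater(a, b):
    """x^a > x^b in grevlex: compare total degree, then the rightmost nonzero
    entry of a - b must be negative."""
    if sum(a) != sum(b):
        return sum(a) > sum(b)
    for ai, bi in zip(reversed(a), reversed(b)):
        if ai != bi:
            return ai < bi
    return False
```

For instance, with exponents (1, 5, 2) and (4, 1, 3), both of degree 8, grevlex ranks the first term higher because the rightmost nonzero entry of the difference (-3, 4, -1) is negative, whereas grlex ranks the second higher because the leftmost nonzero entry is negative.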

Given a term ordering $\succ$, for any polynomial $f \in \mathbb{K}[x]$, $\mathrm{lm}(f)$ stands for the
leading monomial of f with respect to $\succ$. Given an ideal I different from {0},
a Gröbner basis for I is a finite set of polynomials $G = \{g_1, \ldots, g_k\}$ satisfying
$\langle \{ \mathrm{lm}(p) \mid p \in I \} \rangle = \langle \mathrm{lm}(g_1), \ldots, \mathrm{lm}(g_k) \rangle$. For such a set G, $I = \langle G \rangle$ holds (and so
G is indeed a basis for I).

3 Programming Model

To simplify presentation, a program is represented as a finite connected flowchart
with one entry node, assignment, test, junction and exit nodes, as in [6]. We also
assume that the evaluation of arithmetic and boolean expressions has no side
effects and so does not affect the values of program variables, which are denoted
by $x_1, \ldots, x_n$.

Formally, nodes for flowcharts are taken from a set Nodes, which is parti- 
tioned into the following subsets (we show between parentheses the respective 
symbol for pictures): 

1. Entry. There is just one entry node, which has no predecessors and
one successor. It marks where the flow of the program begins.

2. Assignments (□). Assignment nodes have one predecessor and one successor.
Every assignment node is labelled with an identifier $x_i$ and an expression
$f(x)$, thus representing the assignment $x_i := f(x)$.

3. Tests (o). A test node has a predecessor and two successors, corresponding
to the true and false paths. It is labelled with a boolean expression $C(x)$,
which is evaluated when the flow reaches the node.

4. Junctions (O). Junction nodes have one successor and more than one
predecessor. They involve no computation and only represent the merging
of execution paths (in conditional and loop statements).

5. Exits. Exit nodes have just one predecessor and no successors. They
represent where the flow of the program halts.

For example, the program in Fig. 1 incrementally computes the sequence of
squares for the first $x_3$ natural numbers, stored in the variable $x_1$.

Fig. 1. Example of program

4 Ideals of Varieties as Abstract Values

The state of the computation at any given node in a program is the set of values
each program variable can take. This is represented as a subset of $\mathbb{K}^n$. Program
constructs change the state. A state can be abstracted to the ideal consisting of
all polynomials that vanish in that state. This is how the abstraction function
is intuitively defined.

At the abstract level, we work with polynomial ideals (more specifically,
with ideals of variety, i.e. with ideals I such that $I = \mathbf{IV}(I)$). To each arc a of
the flowchart, we attach an assertion $P_a$ of the form $P_a = \{ \bigwedge_{j=1}^{k} p_{aj}(x) = 0 \}$,
or equivalently the ideal $I_a = \langle p_{a1}(x), \ldots, p_{ak}(x) \rangle$, where the $p_{aj} \in \mathbb{K}[x]$ are
polynomials. The abstraction function,

$$\alpha = \mathbf{I} : 2^{\mathbb{K}^n} \to \mathcal{I},$$

is the ideal operator, which yields the ideal of the polynomials that vanish at
the points in the given subset of $\mathbb{K}^n$; and the concretization function,

$$\gamma = \mathbf{V} : \mathcal{I} \to 2^{\mathbb{K}^n},$$

is the variety operator (where $2^{\mathbb{K}^n}$ denotes the powerset of $\mathbb{K}^n$ and $\mathcal{I}$ is the
set of ideals of variety in $\mathbb{K}[x]$). Both $(2^{\mathbb{K}^n}, \subseteq, \cup, \emptyset, \mathbb{K}^n)$ and $(\mathcal{I}, \supseteq, \cap, \langle 1 \rangle, \langle 0 \rangle)$
are semi-lattices, and the functions defined above are morphisms between these
semi-lattices. These operators form a Galois connection, as $\forall A \subseteq \mathbb{K}^n\ \forall I \in \mathcal{I}$,
$\mathbf{I}(A) \supseteq I \iff A \subseteq \mathbf{V}(I)$. The semantics of program constructs for abstract values
is given later on as transformations on polynomial ideals.

The algorithm for computing the invariant ideal of variety for each program
point works as follows. The output ideal of the entry node represents the
precondition, i.e. what is known about the variables at the start of the execution of the
program. Assuming at first that variables are undefined on any arc (i.e. $I_a = \langle 1 \rangle$,
the bottom of $\mathcal{I}$), we propagate the precondition ideal around the flowchart by
application of the semantics until stabilization. In order to guarantee termination,
we assume that each cycle in the graph contains a special junction node,
called loop junction node, for which the respective assertion is approximated
using a widening operator $\nabla$. Intuitively, loop junction nodes correspond to loops,
whereas simple junction nodes are associated with conditional statements.

5 Transformation of Ideals of Variety 
by Language Constructs 

This section develops a semantics of programs for ideals of variety, i.e., for each 
kind of program node, we show how the output ideal of variety can be obtained 
in terms of the input ideals of variety and the relevant information attached to 
the node. 

5.1 Program Entry Node 

If we are given a precondition for the procedure to be analyzed, the $\mathbf{IV}(\cdot)$
of the polynomial equations in it can be used as the output ideal of variety for
the program entry node. Otherwise, if the variables are assumed not to be
initialized, they do not satisfy any constraints and the vector of their values
may be any point in $\mathbb{K}^n$. This is represented by the zero ideal
$\langle 0\rangle = \mathbf{I}(\mathbb{K}^n)$, whose corresponding assertion is the tautology $0 = 0$.

5.2 Assignments

Let $I = \langle p_1, \ldots, p_k\rangle$ be the input ideal of variety of the assignment
node, $x_i$ be the variable that is assigned and $f(x)$ be the right-hand side of
the assignment.

The strongest postcondition of the assertion $\{\bigwedge_{j=1}^{k} p_j(x) = 0\}$ after
the assignment $x_i := f(x)$ is

$$\{\exists x_i' \, (x_i = f(x_i \leftarrow x_i') \wedge (\bigwedge_{j=1}^{k} p_j(x_i \leftarrow x_i') = 0))\},$$

where intuitively $x_i'$ stands for the value of the assigned variable previous to
the assignment and $\leftarrow$ denotes substitution of variables. Our goal now is to
translate this formula in terms of ideals of variety.

Let us assume $f(x) \in \mathbb{K}[x]$. We translate the equality
$x_i = f(x_i \leftarrow x_i')$ into the polynomial $x_i - f(x_i \leftarrow x_i')$ and consider
the ideal

$$I' = \langle x_i - f(x_i \leftarrow x_i'), p_1(x_i \leftarrow x_i'), \ldots, p_k(x_i \leftarrow x_i')\rangle_{\mathbb{K}[x_i', x]}.$$

This ideal $I'$ captures the effect of the assignment, with the drawback that
a new variable $x_i'$ has been introduced. We have to eliminate this variable
$x_i'$ from $I'$ and then compute the corresponding ideal of variety; in other
words, we need to compute all those polynomials in $I'$ that depend only on the
$x$ variables, i.e. $I' \cap \mathbb{K}[x]$, and then take $\mathbf{IV}(I' \cap \mathbb{K}[x])$. As it
can be proved that $\mathbf{IV}(I' \cap \mathbb{K}[x]) = I' \cap \mathbb{K}[x]$, the final output is
$I' \cap \mathbb{K}[x]$.

In our running example, assume that $I_0 = \langle 0\rangle$ and that we want to compute
the output ideal $I_1$ of the assignment $x_1 := 0$. Applying the ideas above, we
take $\langle x_1\rangle \cap \mathbb{K}[x] = \langle x_1\rangle$. This means that if we know nothing about
the variables and apply the assignment $x_1 := 0$, then $x_1 = 0$ after the
assignment.

Invertible Assignments. A common particular case is the following:
$f(x) = c \cdot x_i + f'(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$, where $c \in \mathbb{K}$,
$c \neq 0$ and $f'$ does not depend on $x_i$. Then the assignment is invertible, and
we can express the previous value of the variable $x_i$ in terms of its new
value. It is easy to see that in this case

$$I' = \langle x_i' - \tfrac{1}{c}(x_i - f'(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)), p_1(x_i \leftarrow x_i'), \ldots, p_k(x_i \leftarrow x_i')\rangle.$$

To eliminate $x_i'$ from $I'$ we substitute $x_i'$ by
$\tfrac{1}{c}(x_i - f'(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n))$ in the $p_j$. The output
is $\langle \bigcup_{j=1}^{k} \{p_j(x_i \leftarrow \tfrac{1}{c}(x_i - f'(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)))\}\rangle$.

For instance, assume that $I_4 = \langle x_1, x_2\rangle$ (i.e. $x_1 = x_2 = 0$) and that we
want to compute the output ideal of the assignment $x_1 := x_1 + 2x_2 + 1$. As the
right-hand side of the assignment has the required form, we take
$I_5 = I_4(x_1 \leftarrow x_1 - 2x_2 - 1) = \langle x_1 - 2x_2 - 1, x_2\rangle$. Then, at program
point 5, the variables satisfy $x_1 = 2x_2 + 1$ and $x_2 = 0$, which is consistent
with the result of applying $x_1 := x_1 + 2 * x_2 + 1$ to $(x_1, x_2) = (0, 0)$.
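The substitution rule for invertible assignments can be checked on concrete states. The following is a minimal sketch, not the paper's Macaulay2 implementation: polynomials are modelled as plain Python functions, and the names `assign`, `inverse`, `I4` and `I5` are illustrative assumptions.

```python
def assign(state):
    """Concrete effect of x1 := x1 + 2*x2 + 1 on a state (x1, x2)."""
    x1, x2 = state
    return (x1 + 2 * x2 + 1, x2)


def inverse(state):
    """Previous state recovered from the new one: x1 <- x1 - 2*x2 - 1."""
    x1, x2 = state
    return (x1 - 2 * x2 - 1, x2)


# Generators of the input ideal I4 = <x1, x2>, as functions of (x1, x2).
I4 = [lambda x1, x2: x1, lambda x1, x2: x2]

# Output generators: each p_j composed with the inverse substitution,
# i.e. <x1 - 2*x2 - 1, x2>.
I5 = [lambda x1, x2, p=p: p(*inverse((x1, x2))) for p in I4]

post = assign((0, 0))                    # the only state in V(I4)
values_on_post = [p(*post) for p in I5]  # every generator should vanish

other = assign((1, 0))                   # a state outside V(I4)
values_on_other = [p(*other) for p in I5]
```

The generators of the output ideal vanish on the image of a state in V(I4) and fail to vanish on the image of a state outside it, which is the behaviour the substitution is meant to capture.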

5.3 Test Nodes 

Let $C = C(x)$ be the boolean condition attached to a test node with the input
ideal $I = \langle p_1, \ldots, p_k\rangle$. Then the strongest postconditions for the true
and false paths are respectively

$$\{C(x) \wedge (\bigwedge_{j=1}^{k} p_j(x) = 0)\}, \qquad \{\neg C(x) \wedge (\bigwedge_{j=1}^{k} p_j(x) = 0)\}.$$

For simplicity, below we just show how to express the assertion for the true
path in terms of ideals when $C$ is an atomic formula. More complex boolean
expressions can be handled easily [14].

Polynomial Equalities. If $C$ is a polynomial equality, i.e., it is of the form
$q = 0$ with $q \in \mathbb{K}[x]$, then the states of the true path are
$\mathbf{V}(q) \cap \mathbf{V}(I)$; in this case we take as output

$$\mathbf{IV}(\langle q\rangle + I) = \mathbf{IV}(\langle q, p_1, \ldots, p_k\rangle),$$

since $\mathbf{V}(\langle q\rangle + I) = \mathbf{V}(q) \cap \mathbf{V}(I)$.

For instance, assume that in our example, $I_3 = \langle x_1 - x_2^2\rangle$ and we want
to compute the output ideal $I_7$ of the false path. Now $C(x) = (x_2 \neq x_3)$,
and so $\neg C(x) = (x_2 = x_3)$. According to our discussion above, then
$I_7 = \mathbf{IV}\langle x_2 - x_3, x_1 - x_2^2\rangle = \langle x_2 - x_3, x_1 - x_2^2\rangle$, which means
that at program point 7, $x_2 = x_3$ and $x_1 = x_2^2$.

Polynomial Disequalities. If $C$ is a polynomial disequality, i.e. it is of the
form $q \neq 0$ with $q \in \mathbb{K}[x]$, then the states of the true path are the points
that belong to $\mathbf{V}(I)$ but not to $\mathbf{V}(q)$, in other words
$\mathbf{V}(I) - \mathbf{V}(q)$. So the output should be the ideal of the polynomials
vanishing in this difference of sets, $\mathbf{I}(\mathbf{V}(I) - \mathbf{V}(q))$. As it can be
proved that $\mathbf{I}(\mathbf{V}(I) - \mathbf{V}(q)) = I : \langle q\rangle$, we take $I : \langle q\rangle$ as
output.

For example, if the input ideal of the test node with condition
$C(x) = (x_1 \neq 0)$ is $I = \langle x_1 x_2\rangle$ (either $x_1 = 0$ or $x_2 = 0$), the output
for the true path is $\langle x_1 x_2\rangle : \langle x_1\rangle = \langle x_2\rangle$, which means that, after
the test, we have that $x_2 = 0$ on the true path.
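The quotient example can be sanity-checked by enumerating points. The sketch below samples the variety of ⟨x1·x2⟩ on a small integer grid (the grid bounds are an arbitrary assumption) and confirms that x2 vanishes on every sampled state that takes the true branch of x1 ≠ 0. It illustrates the set difference V(I) - V(q) only; it does not compute an ideal quotient.

```python
from itertools import product

grid = range(-3, 4)

# Points of V(<x1*x2>) on the sample grid: either x1 = 0 or x2 = 0.
variety = [(x1, x2) for x1, x2 in product(grid, grid) if x1 * x2 == 0]

# States that satisfy the test x1 != 0, i.e. V(I) minus V(x1).
true_branch = [(x1, x2) for x1, x2 in variety if x1 != 0]

# The generator x2 of the quotient <x1*x2> : <x1> vanishes on all of them.
x2_vanishes = all(x2 == 0 for _, x2 in true_branch)
```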

Polynomial Inequalities. Over $\mathbb{K} = \mathbb{R}, \mathbb{Q}$, a polynomial inequality
$q > 0$ (or $q < 0$) cannot be made equivalent to a boolean combination of
polynomial equalities. In this case we have to approximate $C$ by polynomial
dis/equalities. For both $q > 0$ and $q < 0$, we approximate it by $q \neq 0$.

5.4 Simple Junction Nodes 

Typically, simple junction nodes correspond to the merging of the execution
paths of conditional statements. In general, if the input ideals of variety
$I_1, \ldots, I_l$ are such that for $1 \leq i \leq l$ we have
$I_i = \langle p_{i1}, \ldots, p_{ik_i}\rangle$, then the strongest postcondition after the
execution of the simple junction node is

$$\{\bigvee_{i=1}^{l} (\bigwedge_{j=1}^{k_i} p_{ij}(x) = 0)\}.$$

Then the output ideal of variety has to be
$\mathbf{I}(\bigcup_{i=1}^{l} \mathbf{V}(I_i)) = \bigcap_{i=1}^{l} \mathbf{IV}(I_i) = \bigcap_{i=1}^{l} I_i$,
since the $I_i$ are ideals of variety and so satisfy $I_i = \mathbf{IV}(I_i)$.

5.5 Loop Junction Nodes 

Intuitively, a loop junction node represents the merging of the execution paths
of a while statement. As the following example illustrates, if we treat loop
junctions as simple junctions, the forward propagation procedure may not
terminate. That implies that we need to approximate.

For instance, consider the loop junction in the running example, with input
arcs 2, 6 and output arc 3. Assume that $I_2 = \langle x_1, x_2\rangle$ (so
$x_1 = x_2 = 0$), $I_3 = \langle x_1 - x_2^2, x_2(x_2 - 1)\rangle$ (either $x_1 = x_2 = 0$ or
$x_1 = x_2 = 1$) and $I_6 = \langle x_1 - x_2^2, (x_2 - 1)(x_2 - 2)\rangle$ (either
$(x_1, x_2) = (1, 1)$ or $(x_1, x_2) = (4, 2)$). The new value for $I_3$ should be

$$I_3 \cap I_2 \cap I_6 = \langle x_1 - x_2^2, x_2(x_2 - 1)\rangle \cap \langle x_1, x_2\rangle \cap \langle x_1 - x_2^2, (x_2 - 1)(x_2 - 2)\rangle = \langle x_1 - x_2^2, x_2(x_2 - 1)(x_2 - 2)\rangle.$$

Notice that the solutions for the polynomials above are such that
$x_1 = x_2^2$ and either $x_2 = 0$ or $x_2 = 1$ or $x_2 = 2$, which is consistent
with the behaviour of the loop, since the semantics captures the effect after
the loop body has been executed $\leq 2$ times.

At the next step of the forward propagation procedure, $I_2 = \langle x_1, x_2\rangle$,
$I_3 = \langle x_1 - x_2^2, x_2(x_2 - 1)(x_2 - 2)\rangle$ and
$I_6 = \langle x_1 - x_2^2, (x_2 - 1)(x_2 - 2)(x_2 - 3)\rangle$. Then the next value for
$I_3$ should be

$$I_3 \cap I_2 \cap I_6 = \langle x_1 - x_2^2, x_2(x_2 - 1)(x_2 - 2)(x_2 - 3)\rangle.$$

After $t$ iterations of the forward propagation procedure, thus,

$$I_3 = \langle x_1 - x_2^2, \prod_{s=0}^{t+1} (x_2 - s)\rangle.$$

It is clear that only the first polynomial $x_1 - x_2^2$ yields an invariant for
the loop, as it remains in $I_3$ after arbitrarily many executions of the loop.
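The growth of the second generator can be observed by running the loop body directly. The following sketch assumes the running example's loop body is x1 := x1 + 2*x2 + 1; x2 := x2 + 1 starting from (0, 0), and ignores the loop guard; the function names are illustrative.

```python
def loop_head_states(max_iters):
    """States (x1, x2) at the loop head after at most max_iters executions
    of the body x1 := x1 + 2*x2 + 1; x2 := x2 + 1, starting from (0, 0)."""
    x1, x2 = 0, 0
    states = [(x1, x2)]
    for _ in range(max_iters):
        x1, x2 = x1 + 2 * x2 + 1, x2 + 1
        states.append((x1, x2))
    return states


def bounded_product(x2, upper):
    """Evaluate the product of (x2 - s) for s = 0..upper."""
    result = 1
    for s in range(upper + 1):
        result *= x2 - s
    return result


states_3 = loop_head_states(3)  # x2 ranges over 0..3
states_4 = loop_head_states(4)  # one more execution of the body
```

The polynomial x1 - x2^2 vanishes on every state, whereas the product with upper bound 3 vanishes on `states_3` but not on the extra state (16, 4) in `states_4`, so it has to be replaced by a longer product at each propagation step.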

In [20], we gave an algebraic geometry-based approach to capture the effect 
of arbitrarily many iterations. Ideal-theoretic manipulations were employed to 
consider the effect of executing a path arbitrarily many times using new param- 
eters standing for the number of times a path is executed and then eliminating 
those parameters using quantifier-elimination and projection. 

An approximate method is proposed below using a widening operator, similar
to the approach for linear inequalities based on abstract interpretation [5, 6].

Widening Operator. Let $I$ be the output ideal of variety associated with a
loop junction node, $I_{ant}$ be its previous value and $J_1, \ldots, J_l$ be the
input ideals going into the loop junction node. An upper approximation of the
set of states $\mathbf{V}(I_{ant}) \cup (\bigcup_{i=1}^{l} \mathbf{V}(J_i))$, or by duality a lower
approximation of $I_{ant} \cap (\bigcap_{i=1}^{l} J_i)$, needs to be computed; the
polynomials in the intersection should be picked so that:

i) the result is still sound, i.e., all values of variables possible at the
junction node are accounted for;

ii) the procedure for computing invariants terminates; and

iii) the method is powerful enough to generate useful invariants.







Formally, we introduce a widening operator $\nabla$ so that
$I_{ant} \nabla (\bigcap_{i=1}^{l} J_i)$ replaces $I_{ant} \cap (\bigcap_{i=1}^{l} J_i)$. In this
context:

Definition 1. A widening $\nabla$ is an operator between ideals of variety such
that:

1. Given two ideals of variety $I$ and $J$, then $I \nabla J \subseteq I \cap J$ (so that
$\mathbf{V}(I \nabla J) \supseteq \mathbf{V}(I \cap J)$, as we do not wish to miss any states).

2. For any decreasing chain of ideals of variety
$J_0 \supseteq J_1 \supseteq \ldots \supseteq J_j \supseteq \ldots$, the chain defined as $I_0 = J_0$,
$I_{j+1} = I_j \nabla J_{j+1}$ is not an infinite decreasing chain.

These two properties take care of the conditions i) and ii) mentioned earlier.
As regards iii), in Sections 6 and 7, we will give evidence that our choice of
the widening operator is quite powerful.

Definition 2. Given two ideals of variety $I, J \subseteq \mathbb{K}[x]$, $d \in \mathbb{N}$ and a
graded term ordering $\succ$ (such as grlex, grevlex), we define $I \nabla_d J$ as

$$I \nabla_d J = \mathbf{IV}(\{p \in GB(I \cap J, \succ) \mid \deg(p) \leq d\}) = \mathbf{IV}(GB(I \cap J, \succ) \cap \mathbb{K}_d[x]),$$

where $GB(I, \succ)$ stands for a Gröbner basis of an ideal $I$ with respect to the
term ordering $\succ$.

Theorem 1. The operator $\nabla_d$ is a widening.

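Proof. It is easy to see from the following relation that given two ideals of
variety $I, J \subseteq \mathbb{K}[x]$, then $I \nabla_d J \subseteq I \cap J$:

$$I \nabla_d J = \mathbf{IV}(GB(I \cap J, \succ) \cap \mathbb{K}_d[x]) \subseteq \mathbf{IV}(GB(I \cap J, \succ)) = \mathbf{IV}(I \cap J) = \mathbf{IV}(I) \cap \mathbf{IV}(J) = I \cap J.$$

Now let us prove that for any decreasing chain of ideals
$J_0 \supseteq J_1 \supseteq \ldots \supseteq J_j \supseteq \ldots$, the chain defined as $I_0 = J_0$,
$I_{j+1} = I_j \nabla_d J_{j+1}$ is not an infinite decreasing chain. Since
$I_0 \supseteq I_1 \supseteq \ldots \supseteq I_j \supseteq \ldots$, we also have the decreasing chain

$$I_0 \cap \mathbb{K}_d[x] \supseteq I_1 \cap \mathbb{K}_d[x] \supseteq \ldots \supseteq I_j \cap \mathbb{K}_d[x] \supseteq \ldots.$$

But each $I_j \cap \mathbb{K}_d[x]$ is a $\mathbb{K}$-vector space: if
$p, q \in I_j \cap \mathbb{K}_d[x]$, then $p + q \in I_j \cap \mathbb{K}_d[x]$, as $I_j$ is an ideal
and $\mathbb{K}_d[x]$ is closed under addition; and if $p \in I_j \cap \mathbb{K}_d[x]$ and
$\lambda \in \mathbb{K}$, we can consider $\lambda \in \mathbb{K}[x]$ and since $I_j$ is an ideal,
$\lambda \cdot p \in I_j \cap \mathbb{K}_d[x]$. So taking dimensions (as vector spaces) we have
that

$$\dim(I_0 \cap \mathbb{K}_d[x]) \geq \dim(I_1 \cap \mathbb{K}_d[x]) \geq \ldots \geq \dim(I_j \cap \mathbb{K}_d[x]) \geq \ldots.$$

But this chain of natural numbers cannot decrease indefinitely, as it is
bounded from below by 0. Therefore there exists $i \in \mathbb{N}$ such that
$\forall j \geq i$, $\dim(I_i \cap \mathbb{K}_d[x]) = \dim(I_j \cap \mathbb{K}_d[x])$. We can assume that
$i \geq 1$ without loss of generality. As
$I_i \cap \mathbb{K}_d[x] \supseteq I_j \cap \mathbb{K}_d[x]$ and the two vector spaces have the same
dimension, we get the equality $I_i \cap \mathbb{K}_d[x] = I_j \cap \mathbb{K}_d[x]$. Since
$i \geq 1$ there exists $S \subseteq \mathbb{K}_d[x]$ such that $I_i = \mathbf{IV}(S)$ (namely,
$S = GB(I_{i-1} \cap J_i, \succ) \cap \mathbb{K}_d[x]$). Then

$$I_i = \mathbf{IV}(S) \subseteq \mathbf{IV}(I_i \cap \mathbb{K}_d[x]) = \mathbf{IV}(I_j \cap \mathbb{K}_d[x]) \subseteq \mathbf{IV}(I_j) = I_j,$$

as $I_j$ is an ideal of variety. But by construction, we already know that
$I_i \supseteq I_j$; so $I_i = I_j$, which implies that the chain must stabilize in a
finite number of steps. □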







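Applying the Widening. Let us apply the widening to our running example
for $d = 2$. Assume that $I_2 = \langle x_1, x_2\rangle$,
$I_3 = \langle x_1 - x_2^2, x_1^2 - 6x_2x_1 + 11x_1 - 6x_2\rangle$ and
$I_6 = \langle x_1 - x_2^2, x_1^2 - 10x_1x_2 + 35x_1 - 50x_2 + 24\rangle$. Taking the graded
term ordering $\succ$ = grevlex($x_1 > x_2$),

$$I_3 \nabla_2 (I_2 \cap I_6) = \mathbf{IV}(GB(I_3 \cap I_2 \cap I_6, \succ) \cap \mathbb{K}_2[x])$$
$$= \mathbf{IV}(\{x_1 - x_2^2,\; x_1^2x_2 - 10x_1^2 + 35x_1x_2 - 50x_1 + 24x_2,\; x_1^3 - 65x_1^2 + 300x_1x_2 - 476x_1 + 240x_2\} \cap \mathbb{K}_2[x])$$
$$= \mathbf{IV}(x_1 - x_2^2) = \langle x_1 - x_2^2\rangle.$$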
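The degree-filtering step of the widening can be sketched in a few lines. The code below is only an illustration: it takes the three Gröbner basis elements quoted above as given (it does not compute a Gröbner basis or an ideal of variety), encodes each polynomial as a dictionary from exponent pairs to coefficients (an assumed representation), and keeps the elements of degree at most d.

```python
# Each polynomial maps an exponent pair (e1, e2) to the coefficient of
# x1**e1 * x2**e2. These are the three basis elements quoted in the text.
g1 = {(1, 0): 1, (0, 2): -1}
g2 = {(2, 1): 1, (2, 0): -10, (1, 1): 35, (1, 0): -50, (0, 1): 24}
g3 = {(3, 0): 1, (2, 0): -65, (1, 1): 300, (1, 0): -476, (0, 1): 240}
basis = [g1, g2, g3]


def degree(poly):
    """Total degree of a polynomial in the dictionary encoding."""
    return max(e1 + e2 for (e1, e2) in poly)


def evaluate(poly, point):
    """Value of the polynomial at point = (x1, x2)."""
    x1, x2 = point
    return sum(c * x1 ** e1 * x2 ** e2 for (e1, e2), c in poly.items())


def retain_up_to_degree(polys, d):
    """Keep only the basis elements of total degree <= d."""
    return [p for p in polys if degree(p) <= d]


retained = retain_up_to_degree(basis, 2)

# States (s**2, s) for s = 0..4 are the common zeros of I3, I2 and I6.
reachable = [(s * s, s) for s in range(5)]
```

All three basis elements vanish on the five sampled states, but only x1 - x2^2 survives the filter for d = 2, and it is also the only one that still vanishes on the next loop state (25, 5).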

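Example 1. Here we give the first iteration of the forward propagation algorithm
on our running example for $d = 2$. Due to lack of space, we cannot provide the
full trace; for more details about the algorithm as well as the trace of the
algorithm, please consult [19].

The calculations are done using $\succ$ = grevlex($x_1 > x_2$). By definition,
$\forall j : 0 \leq j \leq 7$, $I_j^{(0)} = \langle 1\rangle$. Assuming nothing about the values of
variables at the entry point, $I_0^{(1)} = \langle 0\rangle$. After the assignments
$x_1 := 0$ and $x_2 := 0$ (which are not invertible), respectively

$$I_1^{(1)} = (\langle x_1\rangle + \langle 0\rangle) \cap \mathbb{K}[x] = \langle x_1\rangle,$$
$$I_2^{(1)} = (\langle x_2\rangle + \langle x_1\rangle) \cap \mathbb{K}[x] = \langle x_1, x_2\rangle.$$

When dealing with the loop header, since $I_3^{(0)} = I_6^{(0)} = \langle 1\rangle$,

$$I_3^{(1)} = I_3^{(0)} \nabla_2 (I_2^{(1)} \cap I_6^{(0)}) = I_2^{(1)} = \langle x_1, x_2\rangle.$$

When taking the true output path,

$$I_4^{(1)} = I_3^{(1)} : \langle x_2 - x_3\rangle = \langle x_1, x_2\rangle.$$

The assignments $x_1 := x_1 + 2 * x_2 + 1$ and $x_2 := x_2 + 1$ are invertible, and
so:

$$I_5^{(1)} = I_4^{(1)}(x_1 \leftarrow x_1 - 2x_2 - 1) = \langle x_1 - 2x_2 - 1, x_2\rangle,$$
$$I_6^{(1)} = I_5^{(1)}(x_2 \leftarrow x_2 - 1) = \langle x_1 - 2x_2 + 1, x_2 - 1\rangle.$$

Finally, taking the false output path of the loop test we add the condition
$x_2 = x_3$:

$$I_7^{(1)} = \mathbf{IV}(\langle x_2 - x_3\rangle + I_3^{(1)}) = \mathbf{IV}\langle x_1, x_2, x_2 - x_3\rangle = \langle x_1, x_2, x_3\rangle.$$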

Of the subsequent iterations, we just show the computation of $I_3^{(5)}$, which
corresponds to the above example illustrating the application of the widening
operator:

$$I_3^{(5)} = I_3^{(4)} \nabla_2 (I_2^{(5)} \cap I_6^{(4)}) = \mathbf{IV}(\{x_1 - x_2^2,\; x_1^2x_2 - 10x_1^2 + 35x_1x_2 - 50x_1 + 24x_2,$$
$$x_1^3 - 65x_1^2 + 300x_1x_2 - 476x_1 + 240x_2\} \cap \mathbb{K}_2[x]) = \mathbf{IV}(x_1 - x_2^2) = \langle x_1 - x_2^2\rangle.$$

Then, $\forall j : 0 \leq j \leq 7$, $I_j^{(6)} = I_j^{(5)}$. In this case, the widening
operator accomplishes its function and the algorithm stabilizes in 6
iterations, yielding the loop invariant $\{x_1 = x_2^2\}$.
6 Completeness

We show in this section that, under certain assumptions on the semantics of
program constructs, the method is complete for finding polynomial invariants
up to the degree $d$, where $d$ is the parameter in the widening. We simplify the
semantics as given in Section 5 as follows: firstly, conditions in test nodes
are considered to be true¹; further, all assignments are assumed to be linear
(i.e., of the form $x_i := p(x)$, with $p$ a polynomial of degree 1).

The ideal-theoretic semantics of program constructs is used to associate a
system of equations $I = F(I)$ to a program, where the unknowns are the
invariant ideals and $F$ is an expression using sum, intersection and quotient
of ideals and elimination of variables. The least of the solutions to this fix
point equation with respect to $\supseteq$ can be shown to yield the optimal
invariants; but, in general, it cannot be computed in a finite number of steps
by applying forward propagation. The above proposed widening approximates the
intersection of ideals when handling loop junction nodes with a loss of
completeness of the method. The following theorem, however, shows that the
widening is fine enough so as to keep all those polynomials of degree $\leq d$ of
any fix point (for a proof, see [19]).

Theorem 2. Let $I^*$ be a fix point of the application $F$ given by the semantics
of a program (without widening). Let $I_a^{(i)}$ be the approximation obtained at
the $i$-th iteration of the forward propagation procedure using $\nabla_d$ instead
of intersection at loop junction nodes. Then $\forall i \in \mathbb{N}$ and $\forall a$ program
point, $I_a^* \cap \mathbb{K}_d[x] \subseteq I_a^{(i)}$.

In particular, $I^*$ may be the least fix point of $F$ with respect to
$\supseteq$. Therefore, on termination of the approximate forward propagation with
widening, the theorem guarantees that we have computed all the invariant
polynomials of degree $\leq d$. The proof is by induction over $i$. The inductive
step is proved by considering all possible cases of program points and checking
that all polynomials of degree $\leq d$ of the fix point are retained. For that we
use the key property of the widening: the approximation includes all
polynomials of degree $\leq d$ of the intersection; in other words, given $I, J$
ideals, $I \cap J \cap \mathbb{K}_d[x] \subseteq I \nabla_d J$.

7 Examples 

We have implemented a modified version of the above method in Macaulay2 [12],
an algebraic geometry tool that supports the ideal-theoretic operations
needed in the method. In the implementation, the semantics of programs as
given in Section 5 has been simplified: i) in order to speed up the algorithm,
the $\mathbf{IV}(\cdot)$ computations are not performed (although in practice, for all
our examples, some of which are shown in this section, we found the expected
invariants without losing any information); ii) coefficients of terms in
polynomials are considered over a finite field (with a large prime) for Gröbner
basis computations; iii) the only boolean conditions considered are polynomial
equalities; and iv) various paths in conditional statements are incrementally
selected.

¹ As stated earlier, the method can deal with polynomial dis/equalities in test nodes.

Since we are interested in determining nonlinear invariants, we start with
$d = 2$. If that does not yield an invariant, we increment $d$.

Example 2. In order to compare the techniques, the first example has been
extracted from [22]. It is a program that, given two natural numbers $a$ and $b$,
computes simultaneously the gcd and the lcm, which on termination are $x$ and
$u + v$ respectively. Notice that the program has nested loops.

    var a, b, x, y, u, v: integer end var
    (x, y, u, v) := (a, b, b, 0);
    while x ≠ y do
        while x > y do (x, v) := (x - y, u + v); end while
        while x < y do (y, u) := (y - x, u + v); end while
    end while

Our implementation gives the same invariant for the three loops,
$\{xu + yv = ab\}$, computed in 1.96 sec. (using $d = 2$). On termination of the
outer loop, for which the invariant $\{\gcd(x, y) = \gcd(a, b)\}$ can be found by
other methods, we have $x = y \wedge \gcd(x, y) = \gcd(a, b) \wedge xu + yv = ab$, which
implies $u + v = \mathrm{lcm}(a, b)$.
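The reported invariant can be checked by direct simulation. The following Python transcription of the program is a sketch for testing purposes only; the function name `gcd_lcm` and the placement of the assertions are assumptions, and the invariant is asserted rather than derived.

```python
from math import gcd


def gcd_lcm(a, b):
    """Simulate the program of Example 2 for naturals a, b >= 1 and assert
    the invariant x*u + y*v == a*b at the head of each of the three loops."""
    x, y, u, v = a, b, b, 0
    while True:
        assert x * u + y * v == a * b          # outer loop head
        if x == y:
            break
        while True:
            assert x * u + y * v == a * b      # first inner loop head
            if not x > y:
                break
            x, v = x - y, u + v                # simultaneous assignment
        while True:
            assert x * u + y * v == a * b      # second inner loop head
            if not x < y:
                break
            y, u = y - x, u + v
    return x, u + v
```

For example, `gcd_lcm(4, 6)` returns `(2, 12)`.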

Example 3. The next example is an implementation of the extended Euclid's
algorithm to compute Bezout's coefficients $(p, r)$ of two natural numbers
$x, y$ (see [17]), using a division program extracted from [3]. Notice that it
has several levels of nested loops and non-linear polynomial assignments.

    var x, y, a, b, p, q, r, s: integer end var
    (a, b, p, q, r, s) := (x, y, 1, 0, 0, 1);
    while b ≠ 0 do
        var c, k: integer end var
        (c, k) := (a, 0);
        while c ≥ b do
            var d, D: integer end var
            (d, D) := (1, b);
            while c ≥ 2D do (d, D) := (2d, 2D); end while
            (c, k) := (c - D, k + d);
        end while
        (a, b, p, q, r, s) := (b, c, q, p - qk, s, r - sk);
    end while

We get the following invariants in 9.34 sec. using $d = 2$:

1. Outermost loop: $\{px + ry = a \wedge qx + sy = b\}$.

2. Middle loop: $\{px + ry = a \wedge qx + sy = b \wedge kb + c = a\}$.

3. Innermost loop: $\{px + ry = a \wedge qx + sy = b \wedge kb + c = a \wedge db = D \wedge Dk + dc = da\}$.

The invariant of the outermost loop $\{px + ry = a \wedge qx + sy = b\}$ ensures that
$(p, r)$ is a pair of Bezout's coefficients for $x, y$ on termination of the
program.
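The three invariants can be exercised on concrete inputs. The sketch below transcribes the program into Python and asserts each invariant at the corresponding loop head. It assumes that the loop guards are c ≥ b and c ≥ 2D (the comparison signs are partly illegible in the source) and that the inputs are naturals x, y ≥ 1; the function name is illustrative.

```python
from math import gcd


def extended_euclid(x, y):
    """Simulate the program of Example 3 and return (a, p, r) on termination.

    Guards 'c >= b' and 'c >= 2*D' are assumed readings of the source.
    """
    a, b, p, q, r, s = x, y, 1, 0, 0, 1
    while True:
        # Outermost loop invariant.
        assert p * x + r * y == a and q * x + s * y == b
        if b == 0:
            break
        c, k = a, 0
        while True:
            # Middle loop invariant.
            assert k * b + c == a
            if c < b:
                break
            d, D = 1, b
            while True:
                # Innermost loop invariant.
                assert d * b == D and D * k + d * c == d * a
                if c < 2 * D:
                    break
                d, D = 2 * d, 2 * D
            c, k = c - D, k + d
        a, b, p, q, r, s = b, c, q, p - q * k, s, r - s * k
    return a, p, r
```

On termination b = 0, so the outermost invariant gives p*x + r*y = a, and a is the gcd of the inputs.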

Example 4. The following example is a version of a program in [17] that tries
to find a divisor $d$ of a natural number $N$ using a parameter $D$:

    var N, D, d, r, t, q: integer end var
    (d, r, t, q) := (D, N mod D, N mod (D - 2), 4(N div (D - 2) - N div D));
    while d ≤ ⌊√N⌋ ∧ r ≠ 0 do
        if 2r - t + q < 0 then
            (d, r, t, q) := (d + 2, 2r - t + q + d + 2, r, q + 4);
        else if 0 ≤ 2r - t + q < d + 2 then
            (d, r, t) := (d + 2, 2r - t + q, r);
        else if d + 2 ≤ 2r - t + q < 2d + 4 then
            (d, r, t, q) := (d + 2, 2r - t + q - d - 2, r, q - 4);
        else
            (d, r, t, q) := (d + 2, 2r - t + q - 2d - 4, r, q - 8);
        end if
    end while

This is the most nontrivial program we have attempted. With $d = 2$, after
7.86 sec. we do not get any invariant; with $d = 3$, the invariant
$\{d(dq - 4r + 4t - 2q) + 8r = 8N\}$ is generated in 48.82 sec. Even though we
abstract the tests to be true, this is a strong enough polynomial invariant;
together with other non-polynomial invariants, it can be used to prove that on
termination, if $r = 0$ then $d$ is a divisor of $N$.
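The cubic invariant can be tested by simulation. The sketch below runs the program and records, at every loop head, whether d(dq - 4r + 4t - 2q) + 8r = 8N holds. It checks only the polynomial invariant, not the divisor property, which the text says needs further non-polynomial invariants. The comparison operators in the guards are assumed readings of a partly illegible listing, and the function names are illustrative.

```python
from math import isqrt


def invariant_holds(N, d, r, t, q):
    """The polynomial invariant reported for Example 4."""
    return d * (d * q - 4 * r + 4 * t - 2 * q) + 8 * r == 8 * N


def run_divisor_program(N, D):
    """Simulate Example 4 for D >= 3; return the invariant check at each
    loop head. Guard operators (<=, <) are assumptions about the source."""
    d, r, t = D, N % D, N % (D - 2)
    q = 4 * (N // (D - 2) - N // D)
    checks = [invariant_holds(N, d, r, t, q)]
    while d <= isqrt(N) and r != 0:
        e = 2 * r - t + q
        # Right-hand sides use the old values (simultaneous assignment).
        if e < 0:
            d, r, t, q = d + 2, e + d + 2, r, q + 4
        elif e < d + 2:
            d, r, t = d + 2, e, r
        elif e < 2 * d + 4:
            d, r, t, q = d + 2, e - d - 2, r, q - 4
        else:
            d, r, t, q = d + 2, e - 2 * d - 4, r, q - 8
        checks.append(invariant_holds(N, d, r, t, q))
    return checks
```

Because the invariant is preserved by each of the four branches separately, it holds no matter which branch the guards select, which matches the paper's abstraction of the tests to true.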

Other Examples. The table below summarizes the results obtained using our
implementation on other examples². The execution times above as well as in the
table are for a Pentium 4 2.5 GHz processor with 512 MB of memory. There is
a row for each program; the columns provide the following information:

1. 1st column is the name of the program; 2nd column states what the program
does; 3rd column gives the source where the program was picked from (the
entry (*) is for the examples developed by the authors).

2. 4th column is the bound d for the widening operator. 

3. 5th column gives the number of variables in the program; 6th column gives 
the number of conditionals; 7th column is the number of loops; 8th column 
is the maximum depth of nested loops. 

4. 9th column is the number of polynomials in the loop invariant for each loop. 

5. 10th column gives the time taken by the implementation (in seconds). 

Table 1. Table of examples

PROGRAM     COMPUTING         SOURCE    d         VARS      IFS       LOOPS  DEPTH  INV       TIME
cohencu     cube              [3]       [illeg.]  5         [illeg.]  1      1      [illeg.]  2.45
dershowitz  real division     [9]       [illeg.]  7         1         1      1      [illeg.]  1.71
divbin      integer division  [13]      [illeg.]  5         1         2      1      [illeg.]  [illeg.]
euclidex1   Bezout's coefs    [17]      [illeg.]  [illeg.]  [illeg.]  2      2      3-4       7.15
euclidex2   Bezout's coefs    (*)       [illeg.]  8         [illeg.]  1      1      5         [illeg.]
fermat      divisor           [2]       [illeg.]  5         [illeg.]  3      2      [illeg.]  [illeg.]
[illeg.]    [illeg.]          (*)       [illeg.]  [illeg.]  [illeg.]  1      1      1         [illeg.]
[illeg.]    [illeg.]          [11]      [illeg.]  [illeg.]  [illeg.]  1      1      1         [illeg.]
hard        integer division  [22]      [illeg.]  [illeg.]  1         2      1      [illeg.]  2.19
[illeg.]    lcm               [illeg.]  [illeg.]  [illeg.]  1         1      1      1         [illeg.]
[illeg.]    [illeg.]          [illeg.]  [illeg.]  6         [illeg.]  1      1      2         [illeg.]

² These examples are available at www.lsi.upc.es/~erodri

8 Conclusions

We have presented an approach based on abstract interpretation for generating
polynomial invariants of imperative programs. The techniques have been
implemented using the algebraic geometry tool Macaulay2 [12]. The
implementation has successfully computed invariants for many nontrivial
programs. Its performance is very good, as is evident from Table 1.

In the proposed method, the semantics of statements is given using
ideal-theoretic operations; this is a novel idea in contrast to the axiomatic
semantics or denotational semantics typically given for program constructs.
Obviously, only certain kinds of statements can be considered this way; in
particular, restrictions on tests in conditionals and loops, as well as on
assignments, must be imposed. However, using the approach discussed in [15],
where an ideal-theoretic interpretation of first-order predicate calculus is
presented, it might be possible to give an algebraic semantics of arbitrary
programming constructs using ideal-theoretic operations. This needs further
investigation.

Another issue for further research is the widening operator for the semantics
of loop junctions. The widening presented here, which retains polynomials of
degree less than or equal to a certain a priori bound, works very well. But we
will miss invariants if the guess made for the upper bound on the degree of
the invariants is too low. In that sense, the proposed method is complementary
to our earlier work in [20], in which no a priori bound on the degree of
polynomial invariants needs to be assumed.

Also, since the method introduced here is based on abstract interpretation,
it will be easy to integrate it with the techniques for generating invariant
linear inequalities discussed in [6]. Such an integration will result in an
effective and powerful method for generating loop invariants expressed as a
combination of linear inequalities and polynomial equations, thus handling a
large class of programs. In contrast, we do not see how this is feasible with
the recent approaches presented in [22]. The use of the abstract interpretation
framework is also likely to open doors for extending our approach to consider
programs manipulating complex data structures including arrays, records and
recursive data structures.
Abstract. We present a novel static analysis for approximating the
algebraic relational semantics of imperative programs. Our method is
based on abstract interpretation in the lattice of polynomial pseudo
ideals of bounded degree - finite-dimensional vector spaces of polynomials
of bounded degree which are closed under bounded degree products. For
a fixed bound, the space complexity of our approach and the iterations
required to converge on fixed points are bounded by a polynomial in the
number of program variables. Nevertheless, for several programs taken
from the literature on non-linear polynomial invariant generation, our
analysis produces results that are as precise as those produced by more
heavy-weight Gröbner basis methods.



1 Introduction 

The relational semantics of a program characterize its input-output behavior as
a relation on states [22, 21]. A pair of states is included in the relation if
the program, when started in the first state, can halt in the second. The
algebraic relational semantics approximate the relational semantics by a system
of polynomial equalities in the variables $V$, denoting the values of program
variables in the initial state, and $V'$, denoting the values in the final
state. The expressiveness provided by non-linear polynomial equations allows
for more precise approximations of program behavior than is possible with
linear equalities [15]. In the presence of loops, even programs with purely
linear assignments and guards can have input-output relations that cannot be
adequately approximated linearly.

Consider the program presented in Fig. 1(a), which contains only linear as- 
signments. Assuming the variable x is initially non-negative, the program halts 
with the final value of z equal to the square of the initial value of x. When x is 
initially negative, the program never terminates. More formally, the program’s 
relational semantics are given by 

$$[\![ x' = x \wedge y' = x \wedge z' = x^2 ]\!],$$

where $x'$, $y'$, and $z'$ represent the values of the variables in the final
state. There are several uses for this relation: A program verification system
might view it as an abstraction of the program suitable for compositional
reasoning. A re-engineering system might annotate the program with the relation
to document the program's behavior. A compiler, noting that the relation
defines a function, might produce the more efficient and terminating version
shown in Fig. 1(b).

    (a)                                          (b)
    var x, y, z : integer;                       var x, y, z : integer;
    ℓ1 : (y, z) := (0, 0);                       ℓ1 : (y, z) := ...
    ℓ2 : while y ≠ x do                          ℓ2 : halt
    ℓ3 :     (y, z) := (y + 1, z + 2y + 1);
    ℓ4 : halt

Fig. 1. Computing the square

* This research was supported by the Office of Naval Research.
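The input-output relation of Fig. 1(a) can be observed by running the loop. The sketch below is a direct Python transcription for non-negative inputs (for negative x the loop would not terminate, so the function rejects them); the function name is an illustrative assumption.

```python
def square_by_odd_sums(x):
    """Simulate Fig. 1(a): returns the final values (x', y', z')."""
    if x < 0:
        raise ValueError("the program does not terminate for negative x")
    y, z = 0, 0
    while y != x:
        # Simultaneous assignment: the right-hand side uses the old y.
        y, z = y + 1, z + 2 * y + 1
    return x, y, z
```

Every returned triple satisfies x' = x, y' = x and z' = x^2, which is the relation stated above restricted to the inputs on which the program halts.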

The algebraic relational semantics can be approximated directly by abstract 
interpretation [4,3]. Alternatively, traditional invariant generation methods [12, 
27, 16, 5] can be applied by first augmenting the program with auxiliary variables 
which preserve the initial values of program variables, and then approximating 
the set of reachable states. This reduction, however, may cause some analyses
to become infeasible, i.e., those with high complexity in the number of program
variables. For example, the extreme points and rays of a polyhedron may exceed
the available memory while computing convex hulls in linear relation analysis [7, 
14]. When approximating the relational semantics of a program, we must be 
attentive to the complexity of our methods in terms of the number of variables, 
since there are twice as many - representing the initial and final values. 



Related Work. Recently, a number of proposals have been advanced for
generating non-linear invariants of imperative programs: Sankaranarayanan et
al. propose a constraint solving approach for generating non-linear invariants
of bounded degree [25]. They first construct Gröbner bases of the polynomial
systems expressing the initial states of the program and its transition
relation, and extract from these bases a system of constraints characterizing
the inductive polynomial equalities of a given degree. They then generate
invariants by solving these constraints. The principal advantage of their
method is that it eliminates the need for heuristic widening. The disadvantages
are that the resulting constraints are not easily solved and that it is
necessary to guess the number of equations when generating mutually inductive
invariants.

Rodríguez-Carbonell and Kapur propose another non-linear invariant generation
method based on Gröbner bases [24]. While not presented as such, their
approach is essentially an abstract interpretation in the lattice of polynomial
ideals, with the proviso that all assignments appearing in the program be
invertible and satisfy a certain condition which permits quantifier elimination
by substitution. They do not provide a widening, but rather present conditions
under which their method is guaranteed to converge on the strongest invariant.
It is unclear whether the method terminates with an invariant for programs not
meeting these conditions.

Müller-Olm and Seidl generate invariant polynomial equalities of bounded
degree by backward propagation [23]. Their approach is limited to programs
with linear assignments and ignores all guards. However, their analysis is based
on methods from linear algebra and can be performed in both space and time
that is polynomial in the number of program variables. The approaches based
on Gröbner bases, on the other hand, have space and time complexity that is
potentially exponential in the number of variables [28].



Contribution. We present a method which approximates the algebraic rela- 
tional semantics by abstract interpretation in the lattice of bounded degree poly- 
nomial pseudo ideals - vector spaces of polynomials of degree no greater than 
some bound d which are closed under monomial products of degree no greater 
than d. For a fixed bound, the space required to represent pseudo ideals and the 
number of iterations required to converge on fixed points are both bounded by 
a polynomial in the number of program variables. 

Our method builds upon the work of Karr on generating invariant linear
equalities [15], and can be seen as a hybrid of the Gröbner basis methods and
the method of Müller-Olm and Seidl. Like the Gröbner methods, our method
achieves increased precision due to its firm grounding in the theory of polynomial
ideals. Like the method of Müller-Olm and Seidl, our method gains space and
time efficiency by using representations and algorithms from linear algebra. We
have implemented our method and have applied it to a number of programs 
taken from the literature on non-linear polynomial invariant generation. For 
these programs, our method produces results as precise as those produced by 
other methods. 

2 Preliminaries 

Let $V = \{x_1, \ldots, x_n\}$ be a finite set of variables. A state
$\sigma : V \to \mathbb{Q}$ maps each variable to a rational, and $\Sigma$ denotes the set of
all states. A state formula $\varphi$ is a first-order expression whose free
variables belong to $V$. A state $\sigma$ satisfies $\varphi$, denoted
$\sigma \models \varphi$, precisely when $\varphi$ holds in $\sigma$. The set of all states
satisfying $\varphi$ is denoted $[\![\varphi]\!]$.

A (binary) relation $\rho$ on $\Sigma$ is a subset of $\Sigma \times \Sigma$. Each pair
$(\sigma, \sigma') \in \rho$ consists of a prestate $\sigma$ and a poststate $\sigma'$. The
composition $\rho_1 \circ \rho_2$ of two relations is the set of pairs
$(\sigma, \sigma')$ such that $(\sigma, \sigma'') \in \rho_1$ and $(\sigma'', \sigma') \in \rho_2$ for some
intermediate state $\sigma''$. The set of all relations on $\Sigma$ is $\mathcal{R}$. A
relation formula $\psi$ is a first-order expression whose free variables belong
to $V \cup V'$, where the variables in $V$ denote the values in the prestate, and
those in $V' = \{x_1', \ldots, x_n'\}$ denote values in the poststate. A pair
$(\sigma, \sigma')$ satisfies $\psi$ ($(\sigma, \sigma') \models \psi$) if $\psi$ holds in the model
which interprets $V$ as in the prestate and $V'$ as in the poststate. A relation
$\rho$ satisfies $\psi$ if $(\sigma, \sigma') \models \psi$ for every $(\sigma, \sigma') \in \rho$, and
$[\![\psi]\!]$ denotes the largest relation satisfying $\psi$.
A program $\mathcal{P} = \langle L, T, L_i, L_f \rangle$ consists of a finite set of locations $L$; a finite
set of transitions $T$, where each $\tau \in T$ is a tuple $\langle \ell, \ell', \rho \rangle$ consisting of a
prelocation $\ell$, a postlocation $\ell'$, and a relation $\rho$, called the transition relation; a
subset $L_i \subseteq L$ of initial locations; and a subset $L_f \subseteq L$ of final locations. A path
$\pi = \ell_0 \tau_0 \ell_1 \tau_1 \cdots \ell_{n-1} \tau_{n-1} \ell_n \cdots$ of a program is a potentially infinite sequence of
interleaved locations and transitions such that, for each $i \ge 0$, $\ell_i$ is the prelocation
of $\tau_i$ and $\ell_{i+1}$ is its postlocation. A finite path $\ell_0 \tau_0 \cdots \tau_{n-1} \ell_n$ is proper iff
$\ell_0 \in L_i$ and $\ell_n \in L_f$. The path relation $[\![\pi]\!]$ of a finite path $\pi$ is the composition
of the transition relations along $\pi$.

Definition 1 (Relational Semantics) The program relation $[\![\mathcal{P}]\!]$ is the union
of the path relations of the proper paths of $\mathcal{P}$. The program relation is also known
as the input-output relation.

A cycle is a finite path $\ell_0 \tau_0 \cdots \ell_0$ which begins and ends at the same
location. Since acyclic programs contain only finitely many paths, the difficulty
of program analysis lies in programs containing cycles. Let $\mathcal{P}$ be a program with
cycles, and suppose that $\mathcal{P}$ contains a single cut point $c$, i.e., a location contained
in every cycle of $\mathcal{P}$.¹ Call a path $\pi$ simple if $c$ appears only as the initial or final
location, and complex otherwise. Then the proper paths of $\mathcal{P}$ can be partitioned
into a set $\Pi_s$ of simple paths and a set of complex paths. Let $\Pi_i$ be the set of
simple paths from an initial location to $c$, $\Pi_c$ be the set of simple cycles from $c$,
and $\Pi_f$ be the set of simple paths from $c$ to a final location. Then,

$$[\![\mathcal{P}]\!] = [\![\Pi_s]\!] \cup (\rho_\infty \circ [\![\Pi_f]\!]),$$

where $\rho_\infty$ is the least fixed point of the following function on relations:

$$f(\rho) = [\![\Pi_i]\!] \cup (\rho \circ [\![\Pi_c]\!]).$$

Thus, the problem of computing the relational semantics of a program with
cycles reduces to that of computing least fixed points in $\mathcal{R}$.



Abstract Interpretation. A lattice $(L, \sqsubseteq, \sqcap, \sqcup)$ is a nonempty partially ordered
set in which every pair of points $p_1, p_2 \in L$ has a greatest lower bound
$p_1 \sqcap p_2$ (meet) and a least upper bound $p_1 \sqcup p_2$ (join). A lattice is complete if
every subset of $L$ has a greatest lower and a least upper bound in $L$. Every finite
lattice is complete, and every complete lattice possesses a least element $\bot$ and
a greatest element $\top$. The dual of $(L, \sqsubseteq, \sqcap, \sqcup)$ is the lattice $(L, \sqsupseteq, \sqcup, \sqcap)$ [9].

An ascending chain $p_1 \sqsubseteq p_2 \sqsubseteq \cdots \sqsubseteq p_n \sqsubseteq \cdots$ and a descending chain
$p_1 \sqsupseteq p_2 \sqsupseteq \cdots \sqsupseteq p_n \sqsupseteq \cdots$ are potentially infinite sequences of ordered points.
A chain eventually stabilizes iff there is an $i$ such that $p_j = p_i$ for all $j \ge i$.
A lattice satisfies the ascending (descending) chain condition if every infinite
ascending (descending) chain eventually stabilizes. Every lattice satisfying both
chain conditions is complete. A function $f : L \to L$ on a lattice is monotone if
$p_1 \sqsubseteq p_2$ implies $f(p_1) \sqsubseteq f(p_2)$, and a point $p$ with $p = f(p)$ is a fixed point of $f$.
In a lattice satisfying both chain conditions, the least fixed point $\mathrm{lfp}(f)$ and the
greatest fixed point $\mathrm{gfp}(f)$ can be computed iteratively:

$$\mathrm{lfp}(f) = \bigsqcup_{i \ge 0} f^i(\bot) \qquad \mathrm{gfp}(f) = \bigsqcap_{i \ge 0} f^i(\top)$$

¹ The approach extends to sets of cut points.
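The iterative computation above can be sketched in a few lines of Python. This is only an illustration on a toy lattice (subsets of {0, ..., 5} ordered by inclusion); the functions `reach` and `shrink` and the helper `iterate_to_fixed_point` are hypothetical names chosen for this sketch and do not come from the paper.

```python
from typing import Callable, FrozenSet

Point = FrozenSet[int]

def iterate_to_fixed_point(f: Callable[[Point], Point], start: Point,
                           max_steps: int = 1000) -> Point:
    """Apply f repeatedly from `start` until the value stops changing."""
    current = start
    for _ in range(max_steps):
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("no fixed point reached; the chain may be unstable")

# Toy lattice: subsets of UNIVERSE ordered by inclusion (finite, hence both
# chain conditions hold).
UNIVERSE: Point = frozenset(range(6))
BOTTOM: Point = frozenset()

def reach(s: Point) -> Point:
    """Monotone: always contains 0 and is closed under adding 2 within UNIVERSE."""
    return frozenset({0}) | s | frozenset(x + 2 for x in s if x + 2 in UNIVERSE)

def shrink(s: Point) -> Point:
    """Monotone: keeps x only if x is 3 or its predecessor is also present."""
    return frozenset(x for x in s if x != 0 and (x == 3 or x - 1 in s))

# Ascending iteration from the least element, descending from the greatest.
least = iterate_to_fixed_point(reach, BOTTOM)
greatest = iterate_to_fixed_point(shrink, UNIVERSE)
```

Starting from the least element yields the least fixed point of `reach`, and starting from the greatest element yields the greatest fixed point of `shrink`; the step limit only guards against lattices in which chains do not stabilize.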

When the lattice $L$ does not satisfy the chain conditions, there are essentially
two approaches to approximating fixed points: The first involves devising
a widening to extrapolate the limits of unstable chains, while the second entails
constructing another lattice $M$, along with a Galois connection $(\alpha, \gamma)$ from $L$
to $M$, i.e., a pair of monotone functions $\alpha : L \to M$, $\gamma : M \to L$ satisfying
$l \sqsubseteq \gamma(\alpha(l))$ for all $l \in L$ and $\alpha(\gamma(m)) \sqsubseteq m$ for all $m \in M$. The Galois connection
ensures that fixed points in $L$ can be approximated by corresponding fixed
points in $M$ [4, 6].



Polynomial Ideals. A monomial in variables $x_1, \ldots, x_n$ is a product of powers
$x^{\mathbf{d}} = x_1^{d_1} \cdots x_n^{d_n}$. A polynomial $p$ is a linear combination $c_1 x^{\mathbf{d}_1} + \cdots + c_m x^{\mathbf{d}_m}$ of
monomials with rational coefficients. The polynomial ring $\mathbb{Q}[V, V']$ is the set of
all polynomials in $V$ and $V'$. The degree $\deg(x_1^{d_1} \cdots x_n^{d_n})$ of a monomial is the
sum $d_1 + \cdots + d_n$. The degree $\deg(p)$ of a polynomial $p$ is the maximal degree
of its monomials with non-zero coefficients. A polynomial is linear if its degree
is one, quadratic if two, and cubic if three. A set of polynomials $P$ is said to be
of degree $d$ if $\deg(p) \le d$ for every $p \in P$.

The largest relation $\rho$ on $\Sigma$ for which $\rho \models p = 0$ for all $p$ in a set $P$ of
polynomials is denoted $[\![P]\!]$. A relation $\rho$ is algebraic iff $\rho = [\![P]\!]$ for some set of
polynomials $P$. Given a relation $\rho$, the theory $\mathit{Th}(\rho)$ of $\rho$ is the set of polynomials
that vanish on $\rho$, i.e., $\{p \mid \rho \models p = 0\}$. All theories are polynomial ideals.

Definition 2 (Polynomial Ideal) A set $I$ of polynomials from $\mathbb{Q}[V, V']$ is an
ideal iff i) $p_1 + p_2 \in I$ for all $p_1, p_2 \in I$ and ii) $qp \in I$ for all $p \in I$ and $q \in \mathbb{Q}[V, V']$.
The generated ideal $\mathit{Id}(P)$ of a subset $P$ of $\mathbb{Q}[V, V']$ is the set of linear
combinations $q_1 p_1 + \cdots + q_m p_m$, with $p_1, \ldots, p_m \in P$ and $q_1, \ldots, q_m \in \mathbb{Q}[V, V']$.
A set $P$ is a basis of an ideal $I$ iff $I = \mathit{Id}(P)$.

The least ideal is $\{0\}$, and the greatest is $\mathbb{Q}[V, V']$. Ideals are closed under
intersection, but not union. The sum $I_1 + I_2$ of two ideals is the least ideal
which contains them. The set $\mathcal{I}$ of all ideals of $\mathbb{Q}[V, V']$ forms a complete lattice
$(\mathcal{I}, \subseteq, \cap, +)$, and $(\mathit{Th}, [\![\cdot]\!])$ forms a Galois connection from $\mathcal{R}$ to the dual of $\mathcal{I}$.

By the Hilbert basis theorem, every polynomial ideal has a finite basis [8]. 
A special family of bases, known as reduced Gröbner bases, provides canonical
representations of polynomial ideals, can be constructed using Buchberger's
algorithm, and is suitable for implementing lattice operations [1, 11]. However, the
size of a Gröbner basis can be exponential in the number of variables [28].

Vector Spaces. A vector $v : X \to \mathbb{Q}$ maps elements of a finite index set
$X$ to rationals. A vector space is a set of vectors closed under addition and
multiplication by scalars. The generated space $\mathit{Sp}(V)$ of a set of vectors is the set
of linear combinations $q_1 v_1 + \cdots + q_m v_m$, where $v_1, \ldots, v_m \in V$ and $q_1, \ldots, q_m \in \mathbb{Q}$.
Let $\mathcal{S}$ be the set of spaces of vectors indexed by $X$. Then $(\mathcal{S}, \subseteq, \cap, +)$ forms a
lattice satisfying both chain conditions. A basis of a space $S$ is a minimal set of
vectors $V$ with $S = \mathit{Sp}(V)$, and canonical bases of vector spaces can be found in
polynomial time using Gauss-Jordan reduction [26]. All bases of a vector space
$S$ are of the same size, the dimension $\dim(S)$ of the space. The dimension of a
space of vectors indexed by $X$ never exceeds $|X|$, and all chains of such spaces
contain no more than $|X| + 1$ distinct points [17].
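As an illustration of canonical bases, the following sketch performs Gauss-Jordan reduction over the rationals using Python's exact `Fraction` type. The function names `rref`, `same_space` and `dimension` are assumptions made for this example; the paper's own implementation is in Java and is not shown.

```python
from fractions import Fraction

def rref(rows):
    """Return the nonzero rows of the reduced row echelon form of a rational matrix."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(ncols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if src is None:
            continue
        m[pivot_row], m[src] = m[src], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]
        # Clear the column in every other row.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m[:pivot_row]

def same_space(a, b):
    """Two generating sets span the same space iff their canonical bases coincide."""
    return rref(a) == rref(b)

def dimension(rows):
    """The dimension is the number of vectors in the canonical basis."""
    return len(rref(rows))

generators = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]
canonical = rref(generators)
```

Because the reduced form is unique for a given space and column order, comparing canonical bases decides equality of spaces, which is the property the lattice operations in Section 3 rely on.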

3 Approximating the Algebraic Relational Semantics 

Our method approximates the relational semantics of algebraic programs, i.e., 
programs all of whose transition relations are algebraic. Algebraic programs can 
model imperative programs over rational variables constructed using assignment, 
alternation and iteration, provided assignments are polynomials in the program 
variables and the conditions of if and while statements are conjunctions of poly- 
nomial equalities. Programs with integer variables can be modeled by treating 
those variables as rational, but programs with integer division must be afforded 
special treatment to ensure the soundness of the analysis: For example, the assignment
$x := x \mathbin{\mathrm{div}} 2$ can be modeled by the relation $[\![x' = \tfrac{1}{2}x]\!]$ only if $x$ is known
to be even. This property can often be established using other static analyses, 
e.g., congruence analysis [13]. Programs not meeting these conditions can be an- 
alyzed by first constructing algebraic abstractions, e.g., treating non-polynomial 
assignments as non-deterministic and modeling non-algebraic conditions by the 
identity relation. 

One approach to approximating the algebraic relational semantics, suggested
by the Galois connection $(\mathit{Th}, [\![\cdot]\!])$ from $\mathcal{R}$ to the dual of $\mathcal{I}$, is abstract interpretation
in the lattice of polynomial ideals. However, in addition to the high complexity
of operations on ideals, the iterative method of computing fixed points
is hampered by the presence of unstable descending chains. For example, an
analysis of the program presented in Fig. 2 would visit the chain

$$\mathit{Id}(\{x' - 1\}) \supset \mathit{Id}(\{(x' - 1)(x' - 2)\}) \supset \cdots \supset \mathit{Id}\Big(\Big\{\prod_{i=1}^{k} (x' - i)\Big\}\Big) \supset \cdots$$

where multiplication is used to express disjunction. The method we present 
performs abstract interpretation in the lattice of polynomial pseudo ideals of 
bounded degree, a lattice satisfying both chain conditions. 



  var x : integer;
  ℓ1 : x := 1;
  ℓ2 : while true do
  ℓ3 :   x := x + 1;
  ℓ4 : halt

Fig. 2. Descending a chain of ideals

Polynomial Pseudo Ideals. Given a relation $\rho$ and a non-negative integer
$d$, let the theory $\mathit{Th}_d(\rho)$ of degree $d$ be the set of polynomials $p$ in $\mathit{Th}(\rho)$ such
that $\deg(p) \le d$. If we treat polynomials as vectors indexed by monomials, then
$\mathit{Th}_d(\rho)$ is a vector space. A polynomial space of degree $d$ is a vector space indexed
by the set of monomials of degree no greater than $d$. For a fixed degree $d$, the
sizes of bases of polynomial spaces and the number of distinct points in chains
are bounded by a polynomial function in the number of variables $n$: The number
of monomials in $V \cup V'$ of degree $d$ or less is $\binom{2n+d}{d}$, the number of multisets of
size $d$ chosen from $2n + 1$ elements, where the additional element, i.e., 1, allows
for monomials of degree less than $d$. Now, $\binom{2n+d}{d} = O((2n+d)^d)$,
which is polynomial in $n$ for a fixed $d$. Thus, iterative computation of fixed points
in the lattice of polynomial spaces of degree $d$ is feasible, provided $d$ is small,
e.g., two or three.
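The counting argument can be checked directly. The sketch below compares the binomial formula with an explicit enumeration of monomials, represented here (as an assumption of this example) by sorted tuples of variable names, with the empty tuple standing for the monomial 1.

```python
from itertools import combinations_with_replacement
from math import comb

def monomial_count(n: int, d: int) -> int:
    """Number of monomials of degree <= d over the 2n variables of V and V'."""
    return comb(2 * n + d, d)

def enumerate_monomials(variables, d):
    """List every monomial of degree <= d as a tuple of variable names."""
    return [m for k in range(d + 1)
            for m in combinations_with_replacement(variables, k)]

# One program variable x gives the two variables x and x'.
quadratic = enumerate_monomials(["x", "x'"], 2)
```

For a single program variable and degree two, the enumeration lists 1, x, x', x², xx' and (x')², matching `monomial_count(1, 2)`. For three program variables the index set has 28 monomials at degree two and 84 at degree three, which gives a sense of how the vector length grows with the degree bound.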

This lattice, however, contains extraneous points which can impede precise 
analysis. Consider the relation $[\![x + y = 0 \wedge x' = x^2 + xy \wedge y' = y]\!]$. This relation
satisfies $x' = 0$, but the polynomial $x'$ is not in the space generated
by $P = \{x + y,\ x' - x^2 - xy,\ y' - y\}$. To improve the accuracy of the analysis,
linear combinations with polynomial coefficients must be considered:
$x' = x(x + y) + (x' - x^2 - xy)$. However, allowing the use of arbitrary polynomial
coefficients would lead, once again, to the lattice of polynomial ideals. Instead, 
we limit the polynomial coefficients which can appear in a linear combination 
by bounding the degree of the terms appearing in that combination. 

Definition 3 (Polynomial Pseudo Ideal) A polynomial space $J$ of degree $d$
is a polynomial pseudo ideal of degree $d$ iff $qp \in J$ for all $p \in J$, $q \in \mathbb{Q}[V, V']$
with $\deg(qp) \le d$. The generated pseudo ideal $\mathit{Ps}_d(P)$ of a set of polynomials of
degree $d$ is the least pseudo ideal of degree $d$ containing $P$.

While pseudo ideals of degree $d$ are closed under intersection, they are not
closed under sum. Thus we define the join $J_1 \uplus_d J_2$ as $\mathit{Ps}_d(J_1 \cup J_2)$. The set $\mathcal{J}_d$
of pseudo ideals of degree $d$ forms a lattice $(\mathcal{J}_d, \subseteq, \cap, \uplus_d)$. While not a sublattice
of the lattice of polynomial spaces, all chains in this lattice are chains of spaces,
and thus this lattice satisfies both chain conditions. For any relation $\rho$, $\mathit{Th}_d(\rho)$
is a pseudo ideal of degree $d$: If $p_1, p_2$ vanish on $\rho$ and are of degree no greater
than $d$, then $p_1 + p_2$ vanishes on $\rho$ and $\deg(p_1 + p_2) \le d$. If $p$ vanishes on $\rho$ and
$\deg(p) \le d$, then for any $q$ with $\deg(qp) \le d$, $qp$ also vanishes on $\rho$. Furthermore,
$\mathit{Th}_d([\![P]\!]) \supseteq \mathit{Ps}_d(P)$ for any set $P$ of polynomials of degree no greater than $d$.
In other words, moving to the generated pseudo ideal is always sound, but not 
necessarily complete. Based on these observations, the following result can be 
established: 

Lemma 1 For any $d \ge 0$, $(\mathit{Th}_d, [\![\cdot]\!])$ forms a Galois connection from $\mathcal{R}$ to the
dual of $\mathcal{J}_d$.
Polynomial pseudo ideals are not merely the restrictions of ideals to polynomials
of bounded degree. Due to the possibility of canceling high-degree monomials
while taking linear combinations of polynomials, the ideal $I$ generated by
a set $P$ of polynomials of degree $d$ generally contains more polynomials than the
pseudo ideal $J$ generated by $P$, even if we consider only polynomials of degree
no greater than $d$. There can exist a polynomial $p$ with $\deg(p) \le d$ for which the
only linear combinations $q_1 p_1 + \cdots + q_m p_m$ of $P$ generating $p$ have $\deg(q_i p_i) > d$
for some $i \in \{1, \ldots, m\}$. In such a case, $p \in I$, but $p \in J$ need not hold.

In the worst case, the linear combinations establishing membership of the 
polynomial $p$ in the ideal $\mathit{Id}(P)$ can all have terms whose degrees are doubly
exponential in the number of variables, even when the degrees of p and the ba- 
sis P are fixed [28]. As a result, the membership problem for polynomial ideals 
requires exponential space [20]. In contrast, membership in the pseudo ideal gen- 
erated by P can be decided in polynomial time. Pseudo ideals provide tractable 
approximations to polynomial ideals, and the precision of the approximation can 
be improved by increasing the degree bound. 

For the remainder of this section, we assume a fixed degree bound d. 



Representing Pseudo Ideals. We represent pseudo ideals by their basis as 
polynomial spaces. A pseudo ideal J is represented by a set of polynomials P 
such that $J = \mathit{Sp}(P)$. Such a set $P$ always exists, since pseudo ideals are spaces.
Although often containing redundancy, this representation allows us to apply 
well known methods from linear algebra to solve problems on pseudo ideals. 

Essential to our representation of pseudo ideals is an algorithm to compute 
the pseudo ideal hull: Given a set of polynomials P of degree d, find a set Q such 
that the polynomial space generated by Q is the pseudo ideal generated by P. 
Before presenting our algorithm, we state, without proof, the following result: 

Lemma 2 A space $S$ of degree $d$ is a pseudo ideal iff $xp \in S$ for every variable
$x \in V \cup V'$ and every polynomial $p \in S$ with $\deg(p) \le d - 1$.

Based on this lemma, it is easy to decide whether a given space $S$ is a
pseudo ideal. We first compute a basis $P = \{p_1, \ldots, p_m\}$ of the subset $S_{d-1}$ of
polynomials in $S$ of degree no greater than $d - 1$ by eliminating all monomials of
degree $d$ using Gauss-Jordan reduction. Then we verify that $xp$ is in $S$ for every
$p \in P$ and $x \in V \cup V'$. Since any polynomial $p \in S_{d-1}$ is a linear combination
$q_1 p_1 + \cdots + q_m p_m$ of $P$, this condition is both necessary and sufficient for $xp$ to
be a member of $S$ for an arbitrary $p \in S_{d-1}$. Should we discover that $S$ is not a
pseudo ideal, its pseudo ideal hull can be computed iteratively as the least fixed
point of the following monotone function on polynomial spaces:

$$f(T) = T \cup \{xp \mid p \in T_{d-1},\ x \in V \cup V'\}$$

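The hull computation can be sketched as follows, under several assumptions made only for this example: a polynomial is a dictionary from monomials (sorted tuples of variable names) to rational coefficients, columns are ordered with the highest-degree monomials first so that reduction eliminates them first, and the names `columns_for`, `reduce_basis`, `pseudo_ideal_hull` and `member` are hypothetical. It is not the authors' Java implementation.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def columns_for(variables, d):
    """All monomials of degree <= d, highest degree first."""
    names = sorted(variables)
    return [m for k in range(d, -1, -1)
            for m in combinations_with_replacement(names, k)]

def reduce_basis(polys, columns):
    """Gauss-Jordan reduction of polynomials viewed as vectors indexed by columns."""
    rows = [[Fraction(p.get(m, 0)) for m in columns] for p in polys]
    pivot = 0
    for col in range(len(columns)):
        src = next((r for r in range(pivot, len(rows)) if rows[r][col] != 0), None)
        if src is None:
            continue
        rows[pivot], rows[src] = rows[src], rows[pivot]
        lead = rows[pivot][col]
        rows[pivot] = [v / lead for v in rows[pivot]]
        for r in range(len(rows)):
            if r != pivot and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot])]
        pivot += 1
    return [{m: v for m, v in zip(columns, row) if v != 0} for row in rows[:pivot]]

def pseudo_ideal_hull(polys, variables, d):
    """Iterate: multiply every basis polynomial of degree < d by every variable."""
    columns = columns_for(variables, d)
    basis = reduce_basis(polys, columns)
    while True:
        # Rows free of degree-d monomials form a basis of the degree d-1 part.
        low = [p for p in basis if all(len(m) < d for m in p)]
        products = [{tuple(sorted(m + (x,))): c for m, c in p.items()}
                    for p in low for x in variables]
        bigger = reduce_basis(basis + products, columns)
        if bigger == basis:
            return basis
        basis = bigger

def member(p, basis, columns):
    """p lies in the span iff adding it leaves the canonical form unchanged."""
    return reduce_basis(basis + [p], columns) == reduce_basis(basis, columns)

VARS = ["x", "y", "x'", "y'"]
COLS = columns_for(VARS, 2)
# P = {x + y, x' - x^2 - xy, y' - y}, the example discussed before Definition 3.
P = [{("x",): 1, ("y",): 1},
     {("x'",): 1, ("x", "x"): -1, ("x", "y"): -1},
     {("y'",): 1, ("y",): -1}]
X_PRIME = {("x'",): 1}
in_space = member(X_PRIME, P, COLS)
in_hull = member(X_PRIME, pseudo_ideal_hull(P, VARS, 2), COLS)
```

With the degree bound set to two, the polynomial x' is not in the space spanned by P but does belong to the quadratic pseudo ideal hull, because the hull contains the product x(x + y).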

Computing Lattice Operators. Membership of a polynomial $p$ in the pseudo
ideal $J = \mathit{Sp}(P)$ can be decided using Gauss-Jordan reduction: $p \in J$ iff the
canonical form of $P$ is identical to that of $P \cup \{p\}$. Similarly, the pseudo ideal
$J_1 = \mathit{Sp}(P_1)$ is contained in $J_2 = \mathit{Sp}(P_2)$ iff the canonical form of $P_1 \cup P_2$
is identical to that of $P_2$. The intersection of pseudo ideals can be computed
using Karr's algorithm [15], and the join of pseudo ideals $J_1 = \mathit{Sp}(P_1)$ and
$J_2 = \mathit{Sp}(P_2)$ can be computed by taking the hull of $P_1 \cup P_2$.

To approximate the composition $[\![J_1]\!] \circ [\![J_2]\!]$, we introduce a new operator
on pseudo ideals, $J_1 \bullet_d J_2$, computed using the following algorithm: Suppose
$J_1 = \mathit{Sp}(P_1)$ and $J_2 = \mathit{Sp}(P_2)$. We switch temporarily to the polynomial ring
$\mathbb{Q}[V, \bar V, V']$, where $\bar V$ denotes the variable values in intermediate states. In this
ring, we compute the pseudo ideal hull of $Q = P_1[V' \to \bar V] \cup P_2[V \to \bar V]$, where
$P[U \to W]$ denotes substitution of the variables in $U$ by the corresponding variables
in $W$. We then apply Gauss-Jordan reduction to eliminate all monomials
in which a variable of $\bar V$ appears. The resulting set $P$ of polynomials forms the
basis of $J_1 \bullet_d J_2$. By construction, any tuple $(\sigma, \bar\sigma, \sigma')$ satisfying $(\sigma, \bar\sigma) \in [\![P_1]\!]$
and $(\bar\sigma, \sigma') \in [\![P_2]\!]$ causes every polynomial in $Q$ to vanish, where the variables
$\bar V$ are interpreted as in $\bar\sigma$. Since $P$ consists of linear combinations of polynomials
of $Q$ in which the variables of $\bar V$ do not appear, $(\sigma, \sigma') \in [\![P]\!]$. Therefore, $J_1 \bullet_d J_2$
soundly approximates the composition, i.e., $\mathit{Th}_d([\![J_1]\!] \circ [\![J_2]\!]) \supseteq J_1 \bullet_d J_2$.



Example. We now illustrate our method by applying it to the program of
Fig. 1(a), using quadratic pseudo ideals. The first step is to approximate the
relational semantics to the cut point $\ell_2$. The precise relational semantics are
given by the least fixed point of the following function:

$$f(\rho) = \rho_1 \cup (\rho \circ \rho_2),$$

where $\rho_1 = [\![x' = x \wedge y' = 0 \wedge z' = 0]\!]$ is the relation from $\ell_1$ to $\ell_2$, and $\rho_2 =
[\![x' = x \wedge y' = y + 1 \wedge z' = z + 2y + 1]\!]$ is the relation of the cycle from $\ell_2$ back
to $\ell_2$. To approximate $\mathrm{lfp}(f)$, we apply the Galois connection $(\mathit{Th}_2, [\![\cdot]\!])$ from $\mathcal{R}$
to the dual of $\mathcal{J}_2$ and compute the greatest fixed point of $g$,

$$g(J) = J_1 \cap (J \bullet_2 J_2),$$

where $J_1 = \mathit{Ps}_2(\{x' - x,\ y',\ z'\})$ approximates the relation $\rho_1$, while the relation
$\rho_2$ is approximated by $J_2 = \mathit{Ps}_2(\{x' - x,\ y' - y - 1,\ z' - z - 2y - 1\})$. The
soundness of the analysis follows from the fact that $\mathit{Th}_2(f([\![J]\!])) \supseteq g(J)$ for any
quadratic pseudo ideal $J$. Iterating from $\top$ converges on $\mathrm{gfp}(g)$:

$$J_\infty = \mathit{Sp}\left(\left\{\begin{array}{l}
x^2 - (x')^2,\ xy - yx',\ xz - zx',\ xx' - (x')^2, \\
xy' - x'y',\ xz' - x'z',\ (y')^2 - z',\ x - x'
\end{array}\right\}\right)$$
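As a sanity check on the fixed point, the two equalities $(y')^2 = z'$ and $x = x'$ that appear among the generators of $J_\infty$ can be tested against concrete executions. The sketch below simply applies the relation from the initial location once and the cycle relation repeatedly, exactly as they are stated in the text; the function names are assumptions of this example, and a finite test of this kind illustrates the invariant but does not prove it.

```python
def states_at_cut_point(x, iterations):
    """Yield the poststates (x', y', z') reachable at the cut point from input x."""
    xp, yp, zp = x, 0, 0                      # rho_1: x' = x, y' = 0, z' = 0
    yield xp, yp, zp
    for _ in range(iterations):
        # rho_2: x' = x, y' = y + 1, z' = z + 2y + 1 (simultaneous update)
        xp, yp, zp = xp, yp + 1, zp + 2 * yp + 1
        yield xp, yp, zp

def invariants_hold(x, iterations=20):
    """Check x = x' and (y')^2 = z' on every state reached at the cut point."""
    return all(xp == x and yp * yp == zp
               for xp, yp, zp in states_at_cut_point(x, iterations))

trace = list(states_at_cut_point(5, 3))
```

The trace for input 5 shows z' running through the squares 0, 1, 4, 9 while x' stays fixed, which is what the generators $(y')^2 - z'$ and $x - x'$ express.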



The second step of the analysis computes the approximation $J$ of the input-output
relation as the composition of $J_\infty$ and the quadratic pseudo ideal
approximating the path relation from $\ell_2$ to $\ell_4$.
  var x, y, z : integer;
  ℓ1 : z := 0;
  ℓ2 : while y ≠ 0 do
  ℓ3 :   if y mod 2 = 1 then
  ℓ4 :     (x, y, z) := (2x, (y - 1) div 2, x + z)
         else
  ℓ5 :     (x, y, z) := (2x, y div 2, z);
  ℓ6 : halt

Fig. 3. Product by Binary Decomposition

  var a, b, p, q : integer;
  ℓ1 : (p, q) := (1, 0);
  ℓ2 : while a ≠ 0 ∧ b ≠ 0 do
  ℓ3 :   if a mod 2 = 0 ∧ b mod 2 = 0 then
  ℓ4 :     (a, b, p, q) := (a div 2, b div 2, 4p, q)
         else if a mod 2 = 1 ∧ b mod 2 = 0 then
  ℓ5 :     (a, b, p, q) := (a - 1, b, p, q + bp)
         else if a mod 2 = 0 ∧ b mod 2 = 1 then
  ℓ6 :     (a, b, p, q) := (a, b - 1, p, q + ap)
         else
  ℓ7 :     (a, b, p, q) := (a - 1, b - 1, p, q + (a + b - 1)p)
  ℓ8 : halt

Fig. 4. Alternate Product



$J = \mathit{Sp}(\{\ldots\})$ [the basis polynomials of $J$ are not recoverable from the scanned source]


The representation of a pseudo ideal by its basis as a polynomial space is conve- 
nient, but frequently redundant. Simplifying to a pseudo ideal basis of J gives a 
more perspicuous description of the relational semantics: 



$$[\![x' = x \wedge y' = x \wedge z' = x^2]\!]$$



4 Applications 

To demonstrate the utility of our approach, we have implemented the method in 
Java and applied it to several programs taken from the literature on non-linear 
invariant generation. For each of the programs we have examined, our analysis 
produces results as precise as those produced by other methods [25,24]. 
  var x1, x2, y1, y2 : integer;
  ℓ1 : (y1, y2) := (0, x1);
  ℓ2 : while y2 ≥ x2 do
  ℓ3 :   (y1, y2) := (y1 + 1, y2 - x2);
  ℓ4 : halt

Fig. 5. Division by Subtraction



Lacking the ability to express inequalities between variables, algebraic rela- 
tions are frequently incapable of adequately capturing the input-output behavior 
of the programs we consider. This limitation, however, can often be overcome 
by combining our analysis with linear relation analysis [7, 14]. 



Product by Binary Decomposition. Our first example, taken from
Rodríguez-Carbonell and Kapur [24], is shown in Fig. 3. It is a variant of a
program for exponentiation by binary decomposition studied by Manna [18].
Our method computes the relational semantics as

$$[\![y' = 0 \wedge z' = xy]\!].$$

The analysis takes 90ms to complete². The soundness of the analysis depends
on the fact that $y - 1$ is even at $\ell_4$ and $y$ is even at $\ell_5$. These properties can be
established automatically by Granger's congruence analysis [13].



Alternate Product. Our second example, also from Rodríguez-Carbonell and
Kapur and shown in Fig. 4, demonstrates the need to work with pseudo ideals of
degree greater than two. Conducting the analysis in the lattice of cubic pseudo
ideals, our method approximates the relational semantics to cut point $\ell_2$ as

$$[\![q' = ab - a'b'p']\!],$$

and the semantics of the program as

$$[\![a'b' = 0 \wedge q' = ab]\!].$$

Note that the fixed point at $\ell_2$ is a cubic relation, while the input-output relation
of the program is quadratic. Had we approximated the semantics using quadratic
pseudo ideals, the result of the analysis would not have been as precise. By
working with cubic rather than quadratic pseudo ideals, we increase the accuracy
of our analysis, along with the space used and the time taken (2.2s).
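The cubic relation at the cut point can be checked on concrete runs of the program of Fig. 4. The following sketch transcribes the program into Python for non-negative inputs (an assumption of this example, as is the generator name `alternate_product`) and records the state at each visit to the loop head.

```python
def alternate_product(a, b):
    """Run Fig. 4 on non-negative inputs, yielding (a', b', p', q') at the loop head."""
    p, q = 1, 0
    yield a, b, p, q
    while a != 0 and b != 0:
        # Each branch is a simultaneous assignment, as in the figure.
        if a % 2 == 0 and b % 2 == 0:
            a, b, p, q = a // 2, b // 2, 4 * p, q
        elif a % 2 == 1 and b % 2 == 0:
            a, b, p, q = a - 1, b, p, q + b * p
        elif a % 2 == 0 and b % 2 == 1:
            a, b, p, q = a, b - 1, p, q + a * p
        else:
            a, b, p, q = a - 1, b - 1, p, q + (a + b - 1) * p
        yield a, b, p, q

def check(a0, b0):
    """Check q' = ab - a'b'p' at every visit and a'b' = 0, q' = ab on exit."""
    states = list(alternate_product(a0, b0))
    cubic = all(q == a0 * b0 - a * b * p for a, b, p, q in states)
    a, b, p, q = states[-1]
    return cubic and a * b == 0 and q == a0 * b0

all_ok = all(check(a0, b0) for a0 in range(8) for b0 in range(8))
```

The term a'b'p' has degree three, which is why the relation at the loop head is out of reach for quadratic pseudo ideals even though the final relation is quadratic.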



² Reported times are for a 1.4GHz Pentium with 512MB running NetBeans 3.5.1.
  var x1, x2, y1, y2, y3, y4 : integer;
  ℓ1 : (y1, y2, y3, y4) := (x1, x2, 1, 0);
  ℓ2 : while y1 ≥ y2 do
  ℓ3 :   (y2, y3) := (2y2, 2y3);
  ℓ4 : while true do
  ℓ5 :   if y1 ≥ y2 then
  ℓ6 :     (y1, y4) := (y1 - y2, y4 + y3);
  ℓ7 :   if y3 = 1 then
  ℓ8 :     halt;
  ℓ9 :   (y2, y3) := (y2 div 2, y3 div 2)

Fig. 6. Hardware Integer Division



Division by Subtraction. The program in Fig. 5 also appears in Sankaranarayanan
et al. [25]. Returning to quadratic pseudo ideals, our method computes
the following relation in 190ms:

$$[\![x_1' = x_1 \wedge x_2' = x_2 \wedge x_1 = y_1' x_2 + y_2']\!]$$

Assuming that $x_1$ is initially non-negative and $x_2$ is positive, linear relation analysis
[7, 14] can discover the invariant $0 \le y_2' < x_2'$ at $\ell_4$. Thus, upon termination,
$y_1$ holds the quotient of $x_1$ and $x_2$, while $y_2$ holds the remainder.

Hardware Integer Division. Our next example, also from Manna and appearing
in Sankaranarayanan et al., is shown in Fig. 6. This program computes
the quotient and remainder of $x_1$ and $x_2$ by binary search. The fact that $y_3$ is
a positive power of 2 at $\ell_9$, which appears to be derivable by numerical power
analysis [19], combined with the invariant $y_2 = x_2 y_3$, generated by our method,
guarantees that both $y_2$ and $y_3$ are even at $\ell_9$.

Our analysis approximates the relational semantics in 700ms, producing

$$[\![x_1' = x_1 \wedge x_2' = x_2 \wedge y_2' = x_2 \wedge y_3' = 1 \wedge x_1 = x_2 y_4' + y_1']\!].$$

To guarantee correctness of the program, i.e., guarantee that it produces the
quotient and remainder, the additional property $0 \le y_1' < x_2$ is required.

Dijkstra's Square Root. Fig. 7 shows a program, due to Dijkstra [10], for
computing the integer square root of a non-negative integer $n$. The fact that $q$
holds a positive power of 4 at $\ell_5$ and the loop invariant $p^2 = qn - qr$ ensure the
soundness of treating these variables as rational.

In 230ms, our analysis approximates the relational semantics as

$$[\![n' = n \wedge n = (p')^2 + r' \wedge q' = 1]\!].$$

Applying linear relation analysis again, it is possible to infer that $0 \le r' < 2p' + 1$
at location $\ell_9$. Thus, the program halts with $(p')^2 \le n < (p' + 1)^2$.
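The loop invariant and the final relation can be exercised on concrete inputs. The sketch below is a direct Python transcription of Fig. 7 (the function name `dijkstra_sqrt` is an assumption of this example); the invariant $p^2 = qn - qr$ is asserted before the second loop and after each of its iterations.

```python
from math import isqrt

def dijkstra_sqrt(n):
    """Transcription of Fig. 7; returns the final values (p', q', r')."""
    p, q, r = 0, 1, n
    while q <= n:
        q = 4 * q
    assert p * p == q * n - q * r          # invariant on entry to the second loop
    while q != 1:
        q = q // 4                         # exact: q is a positive power of 4 here
        h, p = p + q, p // 2               # simultaneous assignment, as in the figure
        if r >= h:
            p, r = p + q, r - h
        assert p * p == q * n - q * r      # invariant after each iteration
    return p, q, r

results = {n: dijkstra_sqrt(n) for n in range(200)}
```

On every input tried, the returned values satisfy $n = (p')^2 + r'$ with $q' = 1$ and $0 \le r' < 2p' + 1$, so $p'$ agrees with the integer square root computed by the standard library's `isqrt`.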
  var n, p, q, r, h : integer;
  ℓ1 : (p, q, r) := (0, 1, n);
  ℓ2 : while q ≤ n do
  ℓ3 :   q := 4q;
  ℓ4 : while q ≠ 1 do
  ℓ5 :   q := q div 4;
  ℓ6 :   (h, p) := (p + q, p div 2);
  ℓ7 :   if r ≥ h then
  ℓ8 :     (p, r) := (p + q, r - h)
  ℓ9 : halt

Fig. 7. Dijkstra's Square Root

  var x, y1, y2, y3 : integer;
  ℓ1 : (y1, y2, y3) := (0, 1, 1);
  ℓ2 : while y2 ≤ x do
  ℓ3 :   (y1, y2, y3) := (y1 + 1, y2 + y3 + 2, y3 + 2);
  ℓ4 : halt

Fig. 8. Integer Square Root

Integer Square Root. The program in Fig. 8, taken from Manna, computes the
integer square root of a non-negative integer $x$. Our analysis yields the following
relation in 240ms:

$$[\![x' = x \wedge y_2' = (y_1')^2 + 2y_1' + 1 = (y_1' + 1)^2 \wedge y_3' = 2y_1' + 1]\!]$$

Linear relation analysis can be used to infer $y_2' = (y_1' + 1)^2 > x$ at $\ell_4$. Thus,
while the final value of $y_1$ cannot be too small, it might be too large.

A more refined analysis results from splitting the proper paths into those in
which the loop does not execute and those for which it iterates at least once. In
the former case, the input-output relation is

$$[\![1 > x \wedge x' = x \wedge y_1' = 0 \wedge y_2' = 1 \wedge y_3' = 1]\!].$$

Since $x$ is non-negative, $x = 0$ and $y_1' = 0$. In the latter case, the loop condition
must hold for the penultimate iteration of the loop, i.e., $y_2' - y_3' \le x$, giving

$$[\![(y_1')^2 \le x < (y_1' + 1)^2 \wedge x' = x \wedge y_2' = (y_1' + 1)^2 \wedge y_3' = 2y_1' + 1]\!].$$

In either case, $y_1$ is the integer square root of $x$.

5 Conclusion 

We have presented a static analysis which approximates the algebraic relational
semantics of imperative programs by abstract interpretation in the lattice of
polynomial pseudo ideals of a given degree. For a fixed degree, the space required
to represent points in this lattice and the number of iterations needed to
converge on fixed points are both bounded by a polynomial in the number of
program variables. Our work continues the tradition of using linear reasoning to
infer consequences of non-linear constraints [2], and our method is incomplete
relative to an abstract interpretation in the lattice of polynomial ideals. However,
our method is tractable, while the complexity of abstract interpretation using
polynomial ideals can be exponential in the worst case [20, 28]. Furthermore,
for a number of programs drawn from the literature on non-linear polynomial
invariant generation, our method produces results as precise as those produced
by Gröbner basis methods.

  var x : integer;
  ℓ1 : if x² ≠ 0 then
  ℓ2 :   x := 0;
  ℓ3 : halt

Fig. 9. Incompleteness of ideals

In any event, while abstract interpretation using polynomial ideals provides 
a reasonable measure of relative completeness, it too is incomplete. All algebraic 
theories are radical, but the lattice of polynomial ideals contains non-radical 
points. As a result, an analysis in the lattice of ideals will fail to deduce that the
program of Fig. 9 halts with x equal to zero. The analysis of this program should 
be conducted using radical ideals, and the example suggests that the precision 
of our method would be improved by closing pseudo ideals under radicals. 

Our decision to approximate the relational semantics in a lattice of polyno- 
mial pseudo ideals rather than devise a widening for the lattice of ideals was a 
pragmatic one, driven by the computational complexity of operations on ideals. 
For some applications, however, the increased precision afforded by abstract in- 
terpretation in the lattice of ideals justifies the cost. In these cases, the family 
of lattices of polynomial pseudo ideals can serve as the basis of a widening for 
the lattice of ideals [6]. For example, we can iterate for a finite number of steps 
using polynomial ideals, then move to pseudo ideals of degree d, where d is the 
maximal degree of a polynomial appearing in a Gröbner basis. Alternatively, we
can use the monomials appearing in a Gröbner basis along with their factors to
construct the index set of a polynomial space and move to the corresponding 
lattice of pseudo ideals. In other words, we can base the choice of lattice on the 
particular monomials which are observed, and not simply their degrees. 
Abstract. An interesting area in static analysis is the study of numeric properties.
Complex properties can be analyzed using abstract interpretation, provided
that an adequate abstract domain is defined. Each domain can represent and
manipulate a family of properties, providing a different trade-off between the
precision and complexity of the analysis. The contribution of this paper is a new
numeric abstract domain called octahedron that represents constraints of the form
$(\pm x_j \pm \ldots \pm x_k \ge c)$, where $x_i$ are numerical variables such that $x_i \ge 0$. The
implementation of octahedra is based on a new kind of decision diagrams called
Octahedron Decision Diagrams (OhDD).



1 Introduction 

Abstract interpretation [5] defines a generic framework for the static analysis of dy- 
namic properties of a system. This framework can be used, for instance, to analyze 
termination or to discover invariants in programs automatically. However, each analysis 
requires the framework to be parametrized for the relevant domain of properties being 
studied, e.g. numerical properties. 

There is a wide selection of numeric abstract domains that can be used to represent 
and manipulate properties. Some examples are intervals, octagons and convex polyhe- 
dra. Each domain provides a different trade-off between the precision of the properties 
that can be represented and the efficiency of the manipulation. An interesting prob- 
lem in abstract interpretation is the study of new abstract domains that are sufficiently 
expressive to analyze relevant problems and allow an efficient implementation. 

In this paper, a new numerical abstract domain called octahedron is described. This 
abstract domain can represent conjunctions of restricted linear inequalities of the form 
$(\pm x_j \pm \ldots \pm x_k \ge c)$, where $x_i$ are numerical variables such that $x_i \ge 0$. A new kind
of decision diagram called Octahedron Decision Diagram (OhDD) has been specifi- 
cally designed to represent and manipulate this family of constraints efficiently. Several 
analysis problems can be solved using these constraints, such as the analysis of timed 
systems [1, 12], the analysis of string length in C programs [8] and the discovery of 
bounds on the size of asynchronous communication channels. 

The remaining sections of the paper are organized as follows. Section 2 explains re- 
lated work in the definition of numeric domains for abstract interpretation, and previous 
decision diagram techniques used to represent numerical constraints. Section 3 defines 
the numeric domain of octahedra, and section 4 describes the data structure and its op- 
erations. In section 5, some possible applications of the octahedron abstract domain are 
discussed, and some experimental results are provided. Finally, section 6 draws some 
conclusions and suggests some future work. 

R. Giacobazzi (Ed.): SAS 2004, LNCS 3148, pp. 312-327, 2004. 

© Springer-Verlag Berlin Heidelberg 2004
Table 1. A comparison of numeric abstract domains based on inequality properties.

  Abstraction                    Cite         Properties                    Example
  -----------------------------  -----------  ----------------------------  -----------------
  Intervals                      [5]          k1 ≤ x ≤ k2                   2 ≤ x ≤ 5
  Difference Bound               [7, 15]      k1 ≤ x ≤ k2                   1 ≤ x ≤ 3
  Matrices (DBMs)                             x - y ≤ k                     x - y ≤ 5
  Octagons                       [16]         ±x ± y ≤ k                    2 ≤ x + y ≤ 6
  Two variables per inequality   [22]         c1·x1 + c2·x2 ≥ k             2 ≤ 3x - 2y ≤ 5
  Octahedra                      This paper   ±xj ± ... ± xk ≥ c            x - y + z ≥ 5
  Convex polyhedra               [6, 11]      c1·x1 + ... + cn·xn ≥ k       x + 3y - 2z ≥ 6



2 Related Work 

2.1 Numeric Abstract Domains 

An abstract domain is a computer representation for a family of constraints, together
with the algorithms that implement the abstract operators such as union, intersection,
widening or the transfer function. Several abstract domains have been defined for
interesting families of numeric properties, such as inequality or modulo properties.
The octahedron abstract domain belongs to the former category. Other abstract domains
based on inequalities are intervals, difference bound matrices, octagons,
two-variables-per-inequality, and convex polyhedra. Examples of these abstract domains
and their relation to octahedra are shown in Table 1.

Intervals are a representation for constraints on the upper or lower bound of a single
variable, e.g. $(k_1 \leq x \leq k_2)$. Interval analysis is very popular due to its
simplicity and efficiency: an interval abstraction for $n$ variables requires $O(n)$
space, and all operations require $O(n)$ time in the worst case. Octagons are an
efficient representation for a system of inequalities on the sum or difference of
variable pairs, e.g. $(\pm x \pm y \leq k)$ and $(x \leq k)$. The implementation of
octagons is based on difference bound matrices (DBM), a data structure used to
represent constraints on differences of variables, as in $(x - y \leq k)$ and
$(x \leq k)$. Efficiency is an advantage of this representation: the spatial cost for
representing constraints on $n$ variables is $O(n^2)$, while the temporal cost is
between $O(n^2)$ and $O(n^3)$, depending on the operation. Convex polyhedra are an
efficient representation for conjunctions of linear inequality constraints. This
abstraction is very popular due to its ability to express precise constraints. However,
this precision comes with a very high complexity overhead. This complexity has
motivated the definition of abstract domains such as two-variables-per-inequality,
which try to retain the expressiveness of linear inequalities with a lower complexity.

The abstract domain presented in this paper, octahedra, also attempts to keep some of
the flexibility of convex polyhedra with a lower complexity. Instead of limiting the
number of variables per inequality, the coefficients of the variables are restricted to
$\{-1, 0, +1\}$. From this point of view, octahedra provide a precision that is between
octagons and convex polyhedra.

2.2 Decision Diagrams 

The implementation of octahedra is based on decision diagrams. Decision diagram
techniques have been applied successfully to several problems in different application
domains. Binary Decision Diagrams (BDD) [3] provide an efficient mechanism to represent
boolean functions. Zero Suppressed BDDs (ZDD) [14] are specially tuned to represent
sparse functions more efficiently. Multi-Terminal Decision Diagrams (MTBDD) [10]
represent functions from boolean variables to reals, $f : \mathbb{B}^n \to \mathbb{R}$.

The paradigm of decision diagrams has also been applied to the analysis of numerical
constraints. Most of these approaches compare the value of numeric variables with
constants or intervals, or compare the value of pairs of variables. Some examples of
these representations are Difference Decision Diagrams (DDD) [17], Numeric Decision
Diagrams (NDD) [9], and Clock Difference Diagrams (CDD) [2]. These data structures
encode constraints on a maximum of two variables at a time. In other representations,
each node encodes one complex constraint like a linear inequality. Some examples of
these representations are Decision Diagrams with Constraints (DDC) [13] and
Hybrid-Restriction Diagrams (HRD) [24]. The Octahedron Decision Diagrams described in
this paper use an innovative approach to encode linear inequalities. This approach is
presented in Section 4.

3 Octahedra 

3.1 Definitions 

The octahedron abstract domain is now introduced. In the same way as convex polyhedra,
an octahedron abstracts a set of vectors in $\mathbb{Q}^n$ as a system of linear
inequalities satisfied by all these vectors. The difference between convex polyhedra
and octahedra is the family of constraints that are supported.

Definition 1 (Unit linear inequality). A linear inequality is a constraint of the form
$(c_1 \cdot x_1 + \ldots + c_n \cdot x_n \geq k)$ where the constant term $k$ and the
coefficients $c_i$ are in $\mathbb{Q} \cup \{-\infty\}$, e.g. $(3x + 2y - z \geq -7)$.
A linear inequality will be called unit if all coefficients are in $\{-1, 0, +1\}$,
such as $(x + y - z \geq -7)$.

Definition 2 (Octahedron). An octahedron $O$ over $\mathbb{Q}^n$ is the set of solutions
to the system of $m$ unit inequalities $O = \{X \mid AX \geq B \wedge X \geq 0^n\}$,
where $B \in (\mathbb{Q} \cup \{-\infty\})^m$ and $A \in \{-1, 0, +1\}^{m \times n}$.
Octahedra satisfy the following properties:

1. Convexity: An octahedron is a convex set. 

2. Closed for intersection: The intersection of two octahedra is also an octahedron. 

3. Non-closed for union: The union of two octahedra might not be an octahedron. 
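
As a minimal illustration of Definitions 1 and 2 (a sketch, not code from the paper), an octahedron can be stored as a list of (coefficients, bound) pairs. The helper names `is_unit` and `contains` are hypothetical; the sample constraint is the octahedra example of Table 1.

```python
def is_unit(coeffs):
    """A linear inequality is unit if every coefficient is -1, 0 or +1."""
    return all(c in (-1, 0, 1) for c in coeffs)


def contains(octahedron, point):
    """True if `point` satisfies X >= 0 and every constraint C^T X >= k."""
    if any(x < 0 for x in point):
        return False  # octahedra live in the non-negative orthant
    return all(
        sum(c * x for c, x in zip(coeffs, point)) >= k
        for coeffs, k in octahedron
    )


# Single unit constraint over (x, y, z): x - y + z >= 5
octa = [((1, -1, 1), 5)]
```

The membership test simply evaluates each unit inequality, so its cost is linear in the number of stored constraints.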

Figure 1(a) shows some examples of octahedra in two-dimensional space. In Fig. 1(b)
there are several regions of space which are not octahedra, either because they contain
a region with negative values (1), they are not convex (2), they cannot be represented
by a finite system of linear inequalities (3), or because they can be represented as a
system of linear inequalities, but not unit linear inequalities (4). Notice that in
two-dimensional space all octahedra are octagons; octahedra can only show a better
precision than octagons in higher-dimensional spaces.

In the remainder of this paper, we will use $C$ to denote a vector in
$\{-1, 0, +1\}^n$, where $n$ is the number of variables. Therefore, $(C^T X \geq k)$
denotes the unit linear inequality $(c_1 \cdot x_1 + \ldots + c_n \cdot x_n \geq k)$.







Fig. 1. Some examples of (a) octahedra and (b) non-octahedra in two-dimensional space. 

Lemma 1. An octahedron over $n$ variables can be represented by at most $3^n$
non-redundant inequalities.

Proof. Each variable can have at most three different coefficients in a unit linear
inequality. This means that if an octahedron has more than $3^n$ unit inequalities,
some of them will only differ in the constant term, e.g. $(C^T X \geq k_1)$ and
$(C^T X \geq k_2)$. Only one of these inequalities is non-redundant, the one with the
tightest bound (the largest constant), i.e. $(C^T X \geq \max(k_1, k_2))$. □
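
The argument of Lemma 1 can be sketched in a few lines: constraints that share a coefficient vector are merged by keeping the largest constant. The function name `remove_redundant` and the sample system are illustrative assumptions, not part of the paper.

```python
def remove_redundant(constraints):
    """Keep, for each coefficient vector C, only the largest bound k."""
    tightest = {}
    for coeffs, k in constraints:
        if coeffs not in tightest or k > tightest[coeffs]:
            tightest[coeffs] = k
    return tightest


# Three bounds on x and one on x - y, over the variables (x, y).
system = [((1, 0), 2), ((1, 0), 3), ((1, -1), -5), ((1, 0), 1)]
reduced = remove_redundant(system)
```

Since the dictionary is keyed by vectors in {-1, 0, +1}^n, it can never hold more than 3^n entries.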

A problem when dealing with convex polyhedra and octahedra is the lack of canonicity of
the systems of linear inequalities; the same polyhedron/octahedron can be represented
with different inequalities. For example, both $(x = 3) \wedge (y \geq 5)$ and
$(x = 3) \wedge (x + y \geq 8)$ define the same octahedron with different inequalities.
Given a convex polyhedron, there are algorithms to minimize the number of constraints in
a system of inequalities, i.e. removing all constraints that can be derived as linear
combinations. However, in the previous example both representations are minimal and
even then, they are different. Given that the number of possible linear inequalities in
a convex polyhedron is infinite, the definition of a canonical form for convex polyhedra
seems a difficult problem. However, a canonical form for octahedra can be defined using
the result of Lemma 1. Even though the number of inequalities of this canonical form
makes an explicit representation impractical, symbolic representations based on
decision diagrams can manipulate sets of unit inequalities efficiently.
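
To see why the two systems in the example describe the same set, read the equality on x as the pair of unit inequalities x >= 3 and -x >= -3 (an assumption about how the equality is encoded). Each system then derives the missing constraint of the other by a single sum:

```latex
% x = 3 abbreviates the two unit inequalities x >= 3 and -x >= -3.
\begin{align*}
(x + y \geq 8) + (-x \geq -3) &\;\Longrightarrow\; (y \geq 5) \\
(x \geq 3) + (y \geq 5)       &\;\Longrightarrow\; (x + y \geq 8)
\end{align*}
```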

Definition 3 (Canonical form of octahedra). The canonical form of an octahedron
$O \subseteq \mathbb{Q}^n$ is either (i) the empty octahedron or (ii) a system of $3^n$
unit linear inequalities, where in each inequality $(C^T X \geq k)$, $k$ is the tightest
bound satisfied by $O$.

Theorem 1. Two octahedra $O_1$ and $O_2$ represent the same subset of $\mathbb{Q}^n$ if
and only if they both have the same canonical form.

Proof. ($\rightarrow$): Given a constraint $(C^T X \geq k)$, there is a single tightest
bound to that constraint. So if two octahedra are equal, they will have the same bound
for each possible linear constraint, and therefore, the same canonical form.

($\leftarrow$): From its definition, an octahedron is completely characterized by its
system of inequalities. If two octahedra $O_1$ and $O_2$ have the same canonical form,
then they satisfy exactly the same system of inequalities and therefore are equal. □
$A = \{(4 \geq x \geq 2) \wedge (7 \geq y \geq 4)\}$

$B = \{(5 \geq x \geq 1) \wedge (3 \geq y \geq 1)\}$

C-hull $= \{(5 \geq x \geq 1) \wedge (7 \geq y \geq 1) \wedge (4x - y \geq 1) \wedge (-4x - y \geq -23)\}$

O-hull $= \{(5 \geq x \geq 1) \wedge (7 \geq y \geq 1) \wedge (x - y \geq -5) \wedge (-x - y \geq -11)\}$

[Figure panels: C-hull(A,B) and O-hull(A,B)]

Fig. 2. Two upper approximations of the union: convex hull (C-hull) and octahedral hull (O-hull).



Theorem 2. Let $A$ and $B$ be two non-empty octahedra represented by systems of
inequalities of the form $(C^T X \geq k_a)$ and $(C^T X \geq k_b)$ for all
$C \in \{-1, 0, +1\}^n$. The intersection $A \cap B$ is defined by the system of
inequalities $(C^T X \geq \max(k_a, k_b))$, which might be in non-canonical form even if
the input systems were canonical.

Proof. Any point $P \in \mathbb{Q}^n$ that satisfies $(C^T P \geq \max(k_a, k_b))$ will
also satisfy $(C^T P \geq k_a)$ and $(C^T P \geq k_b)$. Therefore, any point $P$
satisfying the new system of inequalities will also appear in both $A$ and $B$. □

Lemma 2. An octahedron $B$ is an upper approximation of an octahedron $A$, noted
$A \subseteq B$, iff (i) $A$ is empty or (ii) for any constraint $(C^T X \geq k_a)$ in
the canonical form of $A$, the equivalent constraint $(C^T X \geq k_b)$ in the canonical
form of $B$ has a constant term $k_b$ such that $(k_a \geq k_b)$.

Proof. By definition, $A \subseteq B$ iff $A = A \cap B$. This lemma is a direct
consequence of this property and Theorem 2. □

Definition 4 (Convex and octahedral hull). The convex hull (C-hull) of two convex
polyhedra $A$ and $B$ is the intersection of all convex polyhedra that include both $A$
and $B$. The octahedral hull (O-hull) of two octahedra $A$ and $B$ is the intersection
of all octahedra that include both $A$ and $B$.

Figure 2 shows an example of the convex and octahedral hulls of two octahedra $A$ and
$B$. Notice that the convex hull is always an upper approximation of the union, and the
octahedral hull is always an upper approximation of the convex hull, i.e.
$A \cup B \subseteq \text{C-hull}(A, B) \subseteq \text{O-hull}(A, B)$.

Theorem 3. Let $A$ and $B$ be two non-empty octahedra whose canonical forms are
respectively $(C^T X \geq k_a)$ and $(C^T X \geq k_b)$ for all $C \in \{-1, 0, +1\}^n$.
Then, the octahedral hull O-hull$(A, B)$ is defined by the system of inequalities
$(C^T X \geq \min(k_a, k_b))$.

Proof. Given a bound $k$ for one inequality $(C^T X \geq k)$ of O-hull$(A, B)$, the
proof can be split into two parts: proving that $k \leq \min(k_a, k_b)$ and proving that
$k \geq \min(k_a, k_b)$.

As the octahedral hull includes $A$ and $B$, all points $P \in A$ and $P \in B$ should
also be in O-hull$(A, B)$. Therefore, any point in $A$ or $B$ should satisfy the
constraints of O-hull$(A, B)$. Given a constraint $(C^T X \geq k)$, it is known that
points in $A$ satisfy $(C^T X \geq k_a)$ and points in $B$ satisfy $(C^T X \geq k_b)$.
If both sets of points must satisfy the constraint in O-hull$(A, B)$, then $k$ must
satisfy $k \leq \min(k_a, k_b)$.

On the other side, the octahedral hull is the least octahedron that includes $A$ and
$B$. Therefore, the bounds of each constraint should be as tight as possible, i.e. as
large as possible. If we know that $k \leq \min(k_a, k_b)$ should hold for a given unit
inequality, the tightest bound for that inequality is precisely $k = \min(k_a, k_b)$.
As a corollary, the octahedral hull computed in this way is in canonical form. □
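
Theorem 3 and Lemma 2 can be checked on the two boxes of Fig. 2 with a short sketch. It assumes the canonical form is stored as a dictionary from coefficient vectors to bounds, and it relies on the fact that, for an axis-aligned box, the tightest bound of a unit inequality is obtained by minimizing each term independently. The names `box_canonical`, `o_hull` and `included` are hypothetical.

```python
from itertools import product


def box_canonical(lo, hi):
    """Canonical form of the box lo <= X <= hi: tightest k for every C."""
    form = {}
    for c in product((-1, 0, 1), repeat=len(lo)):
        # Each term c_i * x_i is minimized at one end of its interval.
        form[c] = sum(min(ci * l, ci * h) for ci, l, h in zip(c, lo, hi))
    return form


def o_hull(fa, fb):
    """Octahedral hull of two canonical forms (Theorem 3)."""
    return {c: min(fa[c], fb[c]) for c in fa}


def included(fa, fb):
    """Inclusion test of Lemma 2 for non-empty canonical forms."""
    return all(fa[c] >= fb[c] for c in fa)


A = box_canonical(lo=(2, 4), hi=(4, 7))  # 2 <= x <= 4, 4 <= y <= 7
B = box_canonical(lo=(1, 1), hi=(5, 3))  # 1 <= x <= 5, 1 <= y <= 3
hull = o_hull(A, B)
```

The diagonal bounds of the result are x - y >= -5 and -x - y >= -11, and both input boxes pass the inclusion test against the hull.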



3.2 Abstractions of Octahedra 



As shown in the previous section, the canonical form of an octahedron provides a useful
mechanism to define operations such as the test for inclusion, the intersection or the
octahedral hull. However, finding an efficient algorithm that can compute the canonical
form of an octahedron from a non-canonical system of inequalities is an open problem at
the time of writing this paper.

On the other hand, octahedra are defined in the context of abstract interpretation of
numeric properties. In this context, the problem is the abstraction of a set of values
in $\mathbb{Q}^n$, and the main concern is ensuring that our abstraction is an upper
approximation of the concrete set of values. Thus, as long as an upper approximation can
be guaranteed, an exact representation of octahedra is not required, as octahedra are
already abstractions of more complex sets. Keeping this fact in mind, efficient
algorithms that operate with upper approximations of the canonical form can be designed.

The first step is the definition of a relaxed version of the canonical form, which is
called saturated form. While the canonical form has the tightest bound in each of its
inequalities, the bounds in the saturated form may be more relaxed. A system of unit
inequalities is in saturated form as long as the bounds imposed by the sum of any pair
of constraints appear explicitly. For example, a saturated form of the octahedron
$(a \geq 3) \wedge (b \geq 0) \wedge (c \geq 0) \wedge (b - c \geq 7) \wedge (a + b \geq 8) \wedge (a + c \geq 6)$
can be defined by the following system of inequalities:

$(a \geq 3) \wedge (b \geq 7) \wedge (c \geq 0) \wedge (a + b \geq 10) \wedge (a + c \geq 6) \wedge (b + c \geq 7)$
$\wedge\ (b - c \geq 7) \wedge (a + b - c \geq 10) \wedge (a + b + c \geq 13)$

where the constraints with a bound of $-\infty$ have been removed for brevity. In this
example, saturation has exposed explicitly that $(a + b \geq 10)$. This inequality is
the linear combination of $(a \geq 3)$, $(b - c \geq 7)$ and $(c \geq 0)$.

A saturated form $O^*$ of an octahedron $O = \{X \mid AX \geq B \wedge X \geq 0^n\}$ can
be computed using the following saturation procedure:

1. Initialize the system of $3^n$ unit inequalities for all possible values of the
   coefficients $C \in \{-1, 0, +1\}^n$. The bound $k$ of a given inequality
   $(C^T X \geq k)$ is chosen as:

   $$k = \begin{cases}
   \max(0, b) & \text{if } C^T X \geq b \text{ appears in } AX \geq B \text{ and } C \geq 0^n \\
   b & \text{if } C^T X \geq b \text{ appears in } AX \geq B \text{ and } C \not\geq 0^n \\
   0 & \text{if } C^T X \geq b \text{ does not appear in } AX \geq B \text{ but } C \geq 0^n \\
   -\infty & \text{otherwise}
   \end{cases}$$

2. Select two inequalities $C_1^T X \geq k_1$ and $C_2^T X \geq k_2$ such that
   $k_1 > -\infty$ and $k_2 > -\infty$. Let us define $C^* = C_1 + C_2$ and
   $k^* = k_1 + k_2$.

3. If $C^* \notin \{-1, 0, +1\}^n$, return to step 2.

4. If ${C^*}^T X \geq k$ appears in the system of inequalities with $k \geq k^*$, return
   to step 2.

5. Replace the inequality ${C^*}^T X \geq k$ by ${C^*}^T X \geq k^*$.

6. Repeat steps 2-5 until:

   - A fixpoint is reached or

   - An inequality $C^T X \geq k$ with $C = 0^n$ and $k > 0$ is found. In this case, the
     octahedron is empty.
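
The procedure above can be sketched as an explicit (non-symbolic) loop over all 3^n coefficient vectors. This is only an illustration under stated assumptions: the function name `saturate`, the dictionary encoding of the input system and the `max_rounds` cut-off are hypothetical, and the paper's implementation works on decision diagrams instead.

```python
from itertools import product

NEG_INF = float("-inf")


def saturate(n, constraints, max_rounds=50):
    """Return the saturated bounds as {C: k}, or None if emptiness is detected.

    `constraints` maps coefficient tuples of length n to bounds b (rows of AX >= B).
    """
    bounds = {}
    for c in product((-1, 0, 1), repeat=n):              # step 1
        nonneg = all(ci >= 0 for ci in c)
        if c in constraints:
            bounds[c] = max(0, constraints[c]) if nonneg else constraints[c]
        else:
            bounds[c] = 0 if nonneg else NEG_INF
    zero = (0,) * n
    for _ in range(max_rounds):                          # step 6
        changed = False
        for c1, c2 in product(list(bounds), repeat=2):   # step 2
            k1, k2 = bounds[c1], bounds[c2]
            if k1 == NEG_INF or k2 == NEG_INF:
                continue
            c_star = tuple(a + b for a, b in zip(c1, c2))
            if c_star not in bounds:                     # step 3: not a unit vector
                continue
            k_star = k1 + k2
            if bounds[c_star] >= k_star:                 # step 4: nothing new
                continue
            if c_star == zero and k_star > 0:            # (0 >= k) with k > 0
                return None
            bounds[c_star] = k_star                      # step 5
            changed = True
        if not changed:
            return bounds                                # fixpoint reached
    raise RuntimeError("no fixpoint within max_rounds")


# Variables (a, b, c); the six input constraints of the example in the text.
example = {(1, 0, 0): 3, (0, 1, 0): 0, (0, 0, 1): 0,
           (0, 1, -1): 7, (1, 1, 0): 8, (1, 0, 1): 6}
saturated = saturate(3, example)
finite = {c: k for c, k in saturated.items() if k > NEG_INF}
```

On this input the sketch derives (b >= 7), (b + c >= 7), (a + b - c >= 10) and (a + b + c >= 13). For a + b it reports 13, because it also adds (a + c >= 6) and (b - c >= 7). The `max_rounds` guard is there because the fixpoint is not guaranteed to be reached for empty octahedra.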

Theorem 4. Let $O = \{X \mid AX \geq B \wedge X \geq 0^n\}$ be a non-empty octahedron.
The saturation algorithm applied to $O$ terminates.

Proof. Each step of the saturation algorithm defines a tighter bound for an inequality
of the octahedron. The new inequality $(C_3^T X \geq k_3')$ is obtained from two
previously known inequalities $(C_1^T X \geq k_1)$ and $(C_2^T X \geq k_2)$, so that
$C_3 = C_1 + C_2$ and $k_3' = k_1 + k_2$, and $k_3' > k_3$, where $k_3$ is the previously
known bound for the inequality.

If inequalities 1 and 2 were computed in previous rounds of the saturation algorithm,
this dependency chain can be expanded, e.g. if inequality 2 comes from inequalities 4
and 5, then $C_3 = C_1 + C_4 + C_5$ and $k_3' = k_1 + k_4 + k_5$. Non-termination of the
saturation algorithm implies that there will be infinitely many sums of pairs of
inequalities. Ignoring the bound $k$, there are only finitely many inequalities over $n$
variables. Therefore, it is always possible to find a step that computes a bound $k_j'$
that depends on a previously known bound $k_j$, i.e. $C_j = C_j + \sum C_i$ and
$k_j' = k_j + \sum k_i$. As $C_j - C_j = \sum C_i = 0^n$ and
$k_j' - k_j = \sum k_i > 0$, the linear combination
$((\sum C_i)^T X \geq \sum k_i)$ is equivalent to $(0 \geq \sum k_i)$ with
$\sum k_i > 0$, i.e. to $(0 > 0)$, which implies that $O$ is empty. □

At each step, the saturation algorithm computes a new linear combination between two
unit inequalities. If this linear combination has a tighter bound than the one already
known, the bound is updated, and so on until a fixpoint is reached. Notice that this
fixpoint may not be reached if the octahedron is empty. For example, the octahedron in
Fig. 3(a) is empty because the sum of the last four inequalities is $(0 \geq 4)$. The
saturation algorithm applied to this octahedron does not terminate. Adding the
constraints in top-down order allows the saturation algorithm to produce
$(x_2 - x_4 \geq 5)$, which can again be used to produce $(x_2 - x_4 \geq 9)$ and so on.
Even then, the saturation algorithm is used to perform the emptiness test for two
reasons. First, there are special kinds of octahedra where termination is guaranteed.
For instance, if all inequalities describe constraints between symbols (all constant
terms are zero), saturation is guaranteed to terminate. Second, the conditions required
to build an octahedron for which the saturation algorithm does not terminate are
complex and artificial, and therefore they will rarely occur.

Even if the saturation algorithm terminates, in some cases it might fail to discover the
tightest bound for an inequality. For example, in the octahedron in Fig. 3(b),
saturation will fail to discover the constraint
$(x_1 - x_2 + x_3 + x_4 + x_5 + x_6 \geq 6)$, as any sum of two inequalities will yield
a non-unit linear inequality. Therefore, given a constraint $(C^T X \geq k_s)$ in the
saturated form, the bound $k_c$ for the same inequality in the canonical form may be
different, $k_c \neq k_s$. But $k_c \geq k_s$ always holds, as $k_c$ is the tightest
bound for that inequality. Using this property, operations like the union or
intersection that have been defined for the canonical form can also be used for the
saturated form. The result will always be an upper approximation of the exact canonical
result, as $k_c \geq k_s$ is the exact definition for upper approximation of octahedra
(Lemma 2).
(a)
$+x_2 - x_4 \geq 1$
$-x_1 - x_2 + x_3 + x_4 + x_5 - x_6 \geq 1$
$+x_1 - x_2 - x_3 + x_4 - x_5 + x_6 \geq 1$
$+x_1 + x_2 + x_3 - x_4 - x_5 - x_6 \geq 1$
$-x_1 + x_2 - x_3 - x_4 + x_5 + x_6 \geq 1$

(b)
$+x_1 - x_2 - x_3 + x_4 \geq 1$
$-x_1 - x_2 + x_3 + x_5 \geq 2$
$+x_1 + x_2 + x_3 + x_6 \geq 3$

Fig. 3. (a) Empty octahedron where the saturation algorithm does not terminate and (b)
non-empty octahedron where the saturated form is different from the canonical form.
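
The two claims illustrated by Fig. 3 can be verified mechanically. The sketch below assumes the coefficient vectors of the figure, with the terms that are illegible in the extracted text chosen so that the last four rows of (a) sum to zero; the helper names `combine` and `is_unit` are hypothetical.

```python
from functools import reduce


def combine(p, q):
    """Sum two constraints given as (coefficients, bound) pairs."""
    (c1, k1), (c2, k2) = p, q
    return tuple(a + b for a, b in zip(c1, c2)), k1 + k2


def is_unit(coeffs):
    return all(c in (-1, 0, 1) for c in coeffs)


# Coefficients over (x1, ..., x6).
FIG_3A = [((0, 1, 0, -1, 0, 0), 1),
          ((-1, -1, 1, 1, 1, -1), 1),
          ((1, -1, -1, 1, -1, 1), 1),
          ((1, 1, 1, -1, -1, -1), 1),
          ((-1, 1, -1, -1, 1, 1), 1)]
FIG_3B = [((1, -1, -1, 1, 0, 0), 1),
          ((-1, -1, 1, 0, 1, 0), 2),
          ((1, 1, 1, 0, 0, 1), 3)]

# (a) The last four rows add up to 0 >= 4, so the octahedron is empty.
last_four = reduce(combine, FIG_3A[1:])

# (a) Adding the rows top-down keeps every partial sum unit.
steps, partial = [], FIG_3A[0]
for row in FIG_3A[1:]:
    partial = combine(partial, row)
    steps.append(partial)

# (b) Every pairwise sum is non-unit, but the sum of all three rows is unit.
pairwise_b = [combine(p, q)[0]
              for i, p in enumerate(FIG_3B) for q in FIG_3B[i + 1:]]
total_b = reduce(combine, FIG_3B)
```

The last partial sum in (a) is x2 - x4 >= 5, which is the first row with a bound increased by 4; repeating the chain raises the bound again, which is the source of non-termination discussed in the text.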

3.3 Abstract Semantics of the Operators 

In order to characterize the octahedron abstract domain, the abstract semantics of the 
abstract interpretation operators must be defined. Intuitively, this abstract semantics is 
defined as simple manipulations of the saturated form of octahedra. All operations are 
guaranteed to produce upper approximations of the exact result, as it was justified in 
section 3.2. Some operations like the intersection can deal with non-saturated forms 
without any loss of precision, while others like the union can only do so at the cost of 
additional over-approximation. 

In the definition of the semantics, $A$ and $B$ will denote octahedra, whose saturated
forms contain inequalities of the form $(C^T X \geq k_a)$ and $(C^T X \geq k_b)$,
respectively.

- Intersection $A \cap B$ is represented by the system of inequalities
  $(C^T X \geq \max(k_a, k_b))$, which might be in non-saturated form.

- Union $A \cup B$ is approximated by the saturated form $(C^T X \geq \min(k_a, k_b))$.

- Inclusion Let $A$ and $B$ be two octahedra. If $k_a \geq k_b$ for all inequalities in
  their saturated form, then $A \subseteq B$. Notice that the implication does not work
  in the other direction, i.e. if $k_a \not\geq k_b$ then we don't know whether
  $A \subseteq B$ or $A \not\subseteq B$.

- Widening $A \nabla B$ is defined as the octahedron with inequalities $(C^T X \geq k)$
  such that

  $$k = \begin{cases} -\infty & \text{if } k_a > k_b \\ k_a & \text{otherwise} \end{cases}$$

  As established in [16], the result should not be saturated in order to guarantee
  convergence in a finite number of steps.

- Extension An octahedron $O$ can be extended with a new variable $y \geq 0$ by
  modifying the constraints of its saturated form $O^*$. Let
  $(c_1 \cdot x_1 + \ldots + c_n \cdot x_n \geq k)$ be a constraint of $O^*$; the
  inequalities that will appear in the saturated form of the extension are:

  • $c_1 \cdot x_1 + \ldots + c_n \cdot x_n - 1 \cdot y \geq -\infty$

  • $c_1 \cdot x_1 + \ldots + c_n \cdot x_n + 0 \cdot y \geq k$

  • $c_1 \cdot x_1 + \ldots + c_n \cdot x_n + 1 \cdot y \geq k$

- Projection A projection of an octahedron $O$ removing a dimension $x_i$ can be
  performed by removing from its saturated form $O^*$ all inequalities where $x_i$ has a
  coefficient that is not zero.

- Unit linear assignment A unit linear assignment
  $[x_i := \sum_{j=1}^{n} c_j \cdot x_j]$ with coefficients $c_j \in \{-1, 0, +1\}$ can
  be defined using the following steps:

  • Extend the octahedron with a new variable $t$.

  • Intersect the octahedron with the octahedron $(t = \sum_{j=1}^{n} c_j \cdot x_j)$.

  • Project the variable $x_i$.

  • Rename $t$ as $x_i$.
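
The less obvious operators of this list can be sketched on an explicit dictionary from coefficient vectors to bounds. This is an illustration only: the function names are hypothetical, and the case split of the widening (drop a bound to minus infinity when the second argument has a weaker bound, keep it otherwise) is an assumption following the usual scheme of [16] transposed to lower bounds.

```python
NEG_INF = float("-inf")


def widen(fa, fb):
    """Keep the bounds of `fa` that still hold in `fb`; forget the others."""
    return {c: (NEG_INF if fa[c] > fb[c] else fa[c]) for c in fa}


def extend(form):
    """Add a new last variable y >= 0 to a saturated form."""
    out = {}
    for c, k in form.items():
        out[c + (-1,)] = NEG_INF  # nothing is known about ... - y
        out[c + (0,)] = k         # unchanged constraint
        out[c + (1,)] = k         # adding y >= 0 preserves the bound
    return out


def project(form, i):
    """Remove variable i: keep only constraints where its coefficient is 0."""
    return {c[:i] + c[i + 1:]: k for c, k in form.items() if c[i] == 0}


# One variable x with 2 <= x <= 5, written as x >= 2 and -x >= -5.
form = {(-1,): -5, (0,): 0, (1,): 2}
grown = {(-1,): -6, (0,): 0, (1,): 2}   # upper bound relaxed to x <= 6
ext = extend(form)
```

Extending and then projecting the new variable returns the original form, while widening against `grown` discards the unstable upper bound on x and keeps the stable lower bound.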

Impact of the conservative inclusion test on abstract interpretation: Using these
operations, upper approximations of the concrete values will be computed in abstract
interpretation. The test of inclusion deserves a special mention, as its result is only
definite if the answer is true. Intuitively, this lack of accuracy arises from the
impossibility of discovering the tightest bound with saturation. In abstract
interpretation, the analysis is performed until a fixpoint is reached, and the fixpoint
is detected using the test for inclusion. The inaccurate test of inclusion might lead to
additional iterations in the abstract interpretation loop. Each iteration will add new
constraints to our octahedra that were not being discovered by saturation, until the
test for inclusion is able to detect the fixpoint. However, in practical examples, this
theoretical scenario does not seem to arise, as constraints tend to be generated in a
structured way that allows saturation to obtain good approximations of the exact
canonical form.

4 Octahedra Decision Diagrams 

4.1 Overview 

The constraints of an octahedron can be represented compactly using a specially devised
decision diagram representation. This representation is called Octahedron Decision
Diagram (OhDD). Intuitively, it can be described as a Multi-Terminal Zero-Suppressed
Ternary Decision Diagram:

- Ternary: Each non-terminal node represents a variable $x_i$ and has three output arcs,
  labelled as $\{-1, 0, +1\}$. Each arc represents a coefficient of $x_i$ in a linear
  constraint.

- Multi-Terminal [10]: Terminal nodes can be constants in $\mathbb{R} \cup \{-\infty\}$.
  The semantics of a path $\sigma$ from the root to a terminal node $k$ is the linear
  constraint $(c_1 \cdot x_1 + c_2 \cdot x_2 + \ldots + c_n \cdot x_n \geq k)$, where
  $c_i$ is the coefficient of the arc taken from the variable $x_i$ in the path $\sigma$.

- Zero-Suppressed [14]: If a variable does not appear in any linear constraint, it also
  does not appear in the OhDD. This is achieved by using special reduction rules as it
  is done in Zero-Suppressed Decision Diagrams.

Figure 4 shows an example of an OhDD and the octahedron it represents on the right. The
shadowed path highlights one constraint of the octahedron, $(x + y - z \geq 2)$. All
constraints that end in a terminal node with $-\infty$ represent constraints with an
unknown bound, such as $(x - y \geq -\infty)$. As the OhDD represents the saturated form
of the octahedron, some redundant constraints such as $(x + y + z \geq 3)$ appear
explicitly.

This representation based on decision diagrams provides three main advantages. First,
decision diagrams provide many opportunities for reuse. For example, nodes in an OhDD
can be shared. Furthermore, different OhDDs can share internal nodes, leading to a
greater reduction in the memory usage. Second, the reduction rules avoid representing
the zero coefficients of the linear inequalities. Finally, symbolic algorithms on OhDD
can deal with sets of inequalities instead of one inequality at a time. All these
factors combined improve the efficiency of operations with octahedra.
Fig. 4. An example of an OhDD. On the right, the constraints of the octahedron.

4.2 Definitions 

Definition 5 (Octahedron Decision Diagram - OhDD). An Octahedron Decision Diagram is a
tuple $(V, G)$ where $V$ is a finite set of positive real-valued variables, and
$G = (N \cup K, E)$ is a labeled single rooted directed acyclic graph with the following
properties. Each node in $K$, the set of terminal nodes, is labeled with a constant in
$\mathbb{R} \cup \{-\infty\}$, and has an outdegree of zero. Each node $n \in N$ is
labeled with a variable $v(n) \in V$, and it has three outgoing arcs, labeled $-$, $0$
and $+$.

By establishing an order among the variables of the OhDD, the notion of ordered OhDD can
be defined. The intuitive meaning of ordered is the same as in BDDs, that is, in every
path from the root to the terminal nodes, the variables of the decision diagram always
appear in the same order. For example, the OhDD in Fig. 4 is an ordered OhDD.

Definition 6 (Ordered OhDD). Let $\succ$ be a total order on the variables $V$ of an
OhDD. The OhDD is ordered if, for any node $n \in N$, all of its descendants $d \in N$
satisfy $v(d) \succ v(n)$.

In the same way, the notion of a reduced OhDD can be introduced. However, the reduction
rules will be different in order to take advantage of the structure of the constraints.
In an octahedron, most variables will not appear in all the constraints. Avoiding the
representation of these variables with a zero coefficient would improve the efficiency
of OhDD. This can be achieved as in ZDDs by using a special reduction rule: whenever the
target of the $-$ arc of a node $n$ is $-\infty$, and the $0$ and $+$ arcs have the same
target $m$, $n$ is reduced as $m$. The rationale behind this rule is the following: if a
constraint $(c_1 \cdot x_1 + \ldots + c_i \cdot x_i + \ldots + c_n \cdot x_n \geq k)$
holds for $c_i = 0$, it will also hold for $c_i = +1$ as $x_i \geq 0$. However, it is not
known if it will hold for $c_i = -1$. This means that in the OhDD, if a variable has
coefficient zero in a constraint, it is very likely that it will end up creating a node
where the $0$ and $+$ arcs have the same target, and the target of the $-$ arc is
$-\infty$. By reducing these nodes, the zero coefficient is not represented in the OhDD.
Remarkably, using this reduction rule, the set of constraints stating that "any sum of
variables is greater or equal to zero" is represented only as the terminal node 0.

Figure 5 shows an example of the two reduction rules. Notice that contrary to BDDs,
nodes where all arcs have the same target will not be reduced.

Definition 7 (Reduced OhDD). A reduced OhDD is an ordered OhDD where none of the
following rules can be applied:

- Reduction of zero coefficients: Let $n \in N$ be a node with the $-$ arc going to the
  terminal $-\infty$, and with the arcs $0$ and $+$ pointing to a node $m$. Replace $n$
  by $m$.

- Reduction of isomorphic subgraphs: Let $D_1$ and $D_2$ be two isomorphic subgraphs of
  the OhDD. Merge $D_1$ and $D_2$.

Fig. 5. Reduction rules for OhDD.
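
Both reduction rules can be enforced at node-creation time. The following sketch is not the CUDD-based implementation of the paper: it assumes nodes are plain tuples (variable index, minus child, zero child, plus child), terminals are floats, and `make_node`, `_unique` and `nonneg_sums` are hypothetical names.

```python
NEG_INF = float("-inf")
_unique = {}  # unique table: structurally equal nodes map to one shared object


def make_node(var, minus, zero, plus):
    """Create a reduced node for variable index `var`."""
    # Reduction of zero coefficients: the '-' arc goes to -infinity and
    # the '0' and '+' arcs share their target, so the node is skipped.
    if minus == NEG_INF and zero == plus:
        return zero
    # Reduction of isomorphic subgraphs: reuse an existing identical node.
    key = (var, minus, zero, plus)
    return _unique.setdefault(key, key)


def nonneg_sums(n):
    """Diagram of 'any sum of the n variables is >= 0', built bottom-up."""
    d = 0.0
    for var in reversed(range(n)):
        d = make_node(var, NEG_INF, d, d)
    return d


n1 = make_node(1, NEG_INF, 0.0, 2.0)   # x1 >= 2
n2 = make_node(1, NEG_INF, 0.0, 2.0)   # same constraint, same shared node
same_targets = make_node(0, 5.0, 5.0, 5.0)
```

As the text states, the system "any sum of variables is greater or equal to zero" collapses to the terminal 0, whereas a node whose three arcs share a target is kept, unlike in BDDs.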

4.3 Implementation of the Operations 

The octahedra abstract domain and its operations have been implemented as OhDD on top of
the CUDD decision diagram package [23]. Each operation on octahedra performs simple
manipulations such as computing the maximum or the minimum between two systems of
inequalities, where each inequality is encoded as a path in an OhDD. These operations
can be implemented as recursive procedures on the decision diagram. The algorithm may
take as arguments one or more decision diagrams, depending on the operation. All these
recursive algorithms share the same overall structure:

1. Check if the call is a base case, e.g. all arguments are constant decision diagrams. 
In that case, the result can be computed directly. 

2. Look up the cache to see if the result of this call was computed previously and is 
available. In that case, return the precomputed result. 

3. Select the top variable t in all the arguments according to the ordering. The al- 
gorithm will only consider this variable during this call, leaving the rest of the 
variables to be handled by the subsequent recursive calls. 

4. Obtain the cofactors of t in each of the arguments of the call. In our case, each 
cofactor represents the set of inequalities for each coefficient of the top variable. 

5. Perform recursive calls on the cofactors of t. 

6. Combine the results of the different calls into the new top node for variable t. 

7. Store the result of this recursive call in the cache. 

8. Return the result to the caller. 
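
The eight steps above can be sketched for a binary operation such as the maximum (intersection) or the minimum (union). The sketch does not use CUDD: it assumes nodes are tuples (variable index, minus child, zero child, plus child) ordered by increasing index, terminals are floats, and a skipped variable has cofactors (minus infinity, d, d) as implied by the zero-coefficient reduction rule. The names `mk`, `cofactors`, `apply_op` and `bound` are hypothetical, and `functools.lru_cache` plays the role of the computed-results cache.

```python
from functools import lru_cache

NEG_INF = float("-inf")


def mk(var, minus, zero, plus):
    """Build a node, applying the zero-coefficient reduction rule."""
    if minus == NEG_INF and zero == plus:
        return zero
    return (var, minus, zero, plus)


def cofactors(d, var):
    """Children of `d` for `var`; a suppressed variable yields (-inf, d, d)."""
    if isinstance(d, tuple) and d[0] == var:
        return d[1], d[2], d[3]
    return NEG_INF, d, d


def apply_op(op, a, b):
    @lru_cache(maxsize=None)                       # steps 2 and 7: cache
    def rec(a, b):
        if not isinstance(a, tuple) and not isinstance(b, tuple):
            return op(a, b)                        # step 1: base case
        t = min(d[0] for d in (a, b) if isinstance(d, tuple))   # step 3
        am, az, ap = cofactors(a, t)               # step 4
        bm, bz, bp = cofactors(b, t)
        return mk(t, rec(am, bm), rec(az, bz), rec(ap, bp))     # steps 5, 6, 8
    return rec(a, b)


def bound(d, coeffs):
    """Follow the path selected by `coeffs` and return its terminal bound."""
    for var, c in enumerate(coeffs):
        d = cofactors(d, var)[c + 1]
    return d


a = mk(0, NEG_INF, 0.0, 2.0)      # x0 >= 2 (all other sums >= 0)
b = mk(1, NEG_INF, 0.0, 3.0)      # x1 >= 3
meet = apply_op(max, a, b)        # intersection, not yet saturated
join = apply_op(min, a, b)        # approximation of the union
```

In `meet`, the path for x0 + x1 carries the bound 3 rather than 5, which illustrates that the intersection may leave the result in non-saturated form.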

The saturation algorithm is a special case: all sums of pairs of constraints are computed
by a single traversal; but if new inequalities have been discovered, the traversal must
be repeated. The process continues until a fixpoint is reached. Even though this
fixpoint might not be reached, as seen in Fig. 3, the number of iterations required to
saturate an octahedron tends to be very low (1-4 iterations) if it is derived from
saturated octahedra, e.g. the intersection of two saturated octahedra.

These traversals might have to visit $3^n$ inequalities/paths in the OhDD in the worst
case. However, as OhDD are directed graphs, many paths share nodes, so many recursive
calls will have been computed previously, and the results will be reused without the
need to recompute. The efficiency of the operations on decision diagrams depends on two
very important factors. The first one is the order of the variables in the decision
diagram. Intuitively, each call should perform as much work as possible. Therefore, the
variables that appear early in the decision diagram should discriminate the result as
much as possible. Currently there is no dynamic reordering [21] in our implementation of
OhDD, but we plan to add it in the near future. A second factor in the performance of
these algorithms is the effectiveness of the cache in reusing previously computed
results.

5 Applications of the Octahedron Abstract Domain 

5.1 Motivating Application 

Asynchronous circuits are circuits in which there is no global clock to synchronize the
different components. Asynchronous circuits replace the global clock by a local
hand-shake between components, gaining several advantages such as lower power usage.
However, the absence of a clock makes the verification of asynchronous circuits more
complex. The lack of a clock makes the circuit more dependent on timing constraints that
ensure the correctness of the synchronization within the circuit. This means that the
correctness of the circuit depends on the delays of its gates and wires.

In many asynchronous circuits implementing control logic, the timing constraints that
arise are unit inequalities. Intuitively, they correspond to constraints of the type

$$\underbrace{(\delta_1 + \cdots + \delta_i)}_{\text{delay}(\text{path}_1)} - \underbrace{(\delta_{i+1} + \cdots + \delta_n)}_{\text{delay}(\text{path}_2)} \geq k$$

hinting that certain paths in the circuit must be longer than other paths. On very rare
occasions, coefficients different from $\pm 1$ are necessary. A typical counterexample
would be a circuit where one path must be $c$ times longer than another one, e.g. a fast
counter.

Example. Figure 6(a) depicts a D flip-flop [20]. Briefly stated, a D flip-flop is a
1-bit register. It stores the data value in signal D whenever there is a rising edge in
the clock signal CK. The output Q of the circuit is the value which was stored in the
last clock rising edge. We would like to characterize the behavior of this circuit in
terms of the internal gate delays. The flip-flop has to be characterized with respect to
three parameters (see Figure 6(b)):

- Setup time, noted as $T_{setup}$, is the amount of time that D should remain stable
  before a clock rising edge.

- Hold time, noted as $T_{hold}$, is the amount of time that D should remain stable after
  a clock rising edge.

- Delay or clock-to-output time, noted as $T_{CK \to Q}$, is the amount of time required
  by the latch to propagate a change in the input D to the output Q.

The timing analysis algorithm is capable of deriving a set of sufficient linear
constraints that guarantee the correctness of the circuit's behavior. This behavior will
be correct if the output Q matches the value of D in the last clock rising edge. Any
behavior not fulfilling this property is considered to be a failure. Fig. 6(c) reports
the set of sufficient timing constraints derived by the algorithm. Each gate $g_i$ has a
symbolic delay in the interval $[d_i, D_i]$. Notice that the timing constraints are unit
inequalities.
Fig. 6. (a) Implementation of a D flip-flop [20], (b) description of variables that
characterize any D flip-flop and (c) sufficient constraints for correctness for any
delay of the gates.



Table 2. Experimental results using convex polyhedra and octahedra.

Example          States   Variables   Time - Poly (sec)   Time - Oct (sec)
nowick               60          30                 0.5                0.1
sbuf-read-ctl        74          31                 1.2                1.4
rev-setup            72          27                 2.1                8.3
alloc-outbound       82          39                 1.3                0.2
ebergen              83          27                 1.3                1.7
mp-forward-pkt      194          29                 1.9                3.8
chu133              288          26                 1.3                1.0

Experimental Results. Timing verification has been performed on several asynchro- 
nous circuits from the literature. This verification can be seen as the analysis of a set 
of clock variables, and the underlying timing behavior can be modeled as assignments 
and guards on these variables [4]. The analysis of clock variables has been performed 
using two different numeric abstractions: convex polyhedra and octahedra. The imple- 
mentation of polyhedra uses the New Polka polyhedra library [19], while the library 
of OhDD is implemented on top of the CUDD package [23]. Table 2 shows a compar- 
ison of the experimental results for some examples. All these examples were verified 
successfully using both octahedra and polyhedra, as all relevant constraints were unit 
linear inequalities. For all these cases, the execution time of convex polyhedra and octa- 
hedra is comparable, while the memory usage for octahedra is lower. For each example, 
we provide the number of different states (configurations) of the circuit, the number 
of clock and delay variables of the abstractions and the execution time required by the 
analysis with each abstraction. 

The difference in memory usage is quantified in the next example, an asynchronous
pipeline with a varying number of stages and an environment running at a fixed
frequency. The processing time required by each stage i is bounded by an interval
$[d_i, D_i]$ whose lower and upper bounds are unknown. Whenever a stage finishes its
computation, it sends the result to the next stage if it is empty. The safety property being 
verified in this case was "the environment will never have to wait before sending new
data to the pipeline", i.e. whenever the environment sends new data to the pipeline, the
first stage is empty. Fig. 7 shows the pipeline, with an example of a correct and incorrect 
behavior. The tool discovers that correct behavior can be ensured if the following holds: 
[Fig. 7 graphic not recoverable from the extracted text. Panels: (a) pipeline, (b) correct
behavior, (c) incorrect behavior. The accompanying table has columns "# of stages",
"# of states", "# of variables", and CPU time and memory for both Polyhedra and OhDD;
only a fragment of its data survives (the first row begins 2, 36, 20, 0.6s).]

Fig. 7. (a) Asynchronous pipeline with N=3 stages, (b) correct behavior of the pipeline and (c)
incorrect behavior. Dots represent data elements. On the right, the CPU time and memory required
to verify pipelines with different number of stages.



\[
d_{IN} > D_1 \wedge \cdots \wedge d_{IN} > D_N \wedge d_{IN} > D_{OUT}
\]

where $D_i$ is the delay of stage i, and $d_{IN}$ and $D_{OUT}$ refer to environment delays. This
property is equivalent to:

\[
d_{IN} > \max(D_1, \ldots, D_N, D_{OUT})
\]

Therefore, the pipeline is correct if the environment is slower than the slowest stage 
of the pipeline. Both the polyhedra and octahedra abstract domains are able to discover
this property. This example is interesting because it exhibits a very high degree of con- 
currency. The verification times and memory usage for different lengths of the pipeline 
can be found in Fig. 7. Notice that the memory consumption of OhDD is lower than 
that of convex polyhedra. This reduction in memory usage is sufficient to verify larger 
pipelines (n = 6 stages) not verifiable with our convex polyhedra implementation. How- 
ever, this memory reduction comes at the expense of an increase in the execution time. 



5.2 Other Applications 

In general, the octahedron abstract domain may be interesting in any analysis problem 
where convex polyhedra can be used. Many times, the precision obtained with convex 
polyhedra is very good, but the efficiency of the analysis limits the applicability. In 
these scenarios, using octahedra might be adequate as long as the variables involved in 
the analysis are positive and unit linear inequalities provide sufficient information for 
the specific problem. Some examples of areas of applications are the following: 

- Analysis of program invariants involving unsigned variables. 

- Static discovery of bounds on the size of asynchronous communication channels:
Many systems communicate using a non-blocking semantics, where the sender 
does not wait until the receiver is ready to read the message. In these systems, each 
channel requires a buffer to store the pending messages. Allocating these buffers 
statically would improve performance but it is not possible, as the amount of pend- 
ing messages during execution is not known in advance. Analysis with octahedra 
could discover these bounds statically. This problem is related to the problem of 
structural boundedness of a Petri Net [18], where an upper bound on the number of 
tokens that can be in each place of the Petri Net must be found. 
- Analysis of timed systems: Clocks and delays are restricted to positive values in
many types of models. Octahedra can be used to analyze these values and discover
complex properties such as timing constraints or worst-case execution
time (WCET).

- Analysis of string length in C programs [8]: Checking the absence of buffer
overflows is important in many scenarios, especially in applications where security
is critical, e.g. an operating system. C programs are prone to errors related to the
manipulation of strings. Several useful constraints on the length of strings can be
represented with octahedra. For instance, a constraint on the concatenation of two
strings can be strlen(strcat(s1, s2)) = strlen(s1) + strlen(s2).



6 Conclusions and Future Work 

A new numeric abstract domain called octahedron has been presented. This domain can 
represent and manipulate constraints on the sum or difference of an arbitrary number of 
variables. In terms of precision, this abstraction is between octagons and convex
polyhedra. Regarding complexity, the worst case complexity of octahedra operations over
n variables is $O(3^n)$ in memory, and $O(3^n)$ in execution time in addition to the cost
of saturation. However, worst-case performance is misleading due to the use of a
decision diagram approach. For instance, BDDs have a worst-case complexity of $O(2^n)$,
but they have a very good behavior in many real examples. Performance in this case 
depends on factors such as the ordering of the variables in the decision diagram and the 
effectiveness of the cache. In the experimental results of OhDD, memory consumption 
was shown to be smaller than that of our convex polyhedra implementation. Running 
time was comparable to that of convex polyhedra in small and medium-sized exam- 
ples, while in more complex examples the execution time was worse. This shows that 
OhDD trades speed for a reduction in memory usage.

Future work in this area will try to improve the execution time of octahedra opera- 
tions. For example, dynamic reordering [21] would improve efficiency if proper heuris- 
tics to find good variable orders can be developed. Another area where there is room for 
improvement is the current bottleneck of the representation, the saturation procedure. 
Abstract. We describe data structures and algorithms for performing
a path-sensitive program analysis to discover equivalences of expressions 
involving linear arithmetic or uninterpreted functions. We assume that 
conditionals are abstracted as boolean variables, which may be repeated 
to reflect equivalent conditionals. We introduce free conditional expres- 
sion diagrams (FCEDs), which extend binary decision diagrams (BDDs) 
with internal nodes corresponding to linear arithmetic operators or un- 
interpreted functions. FCEDs can represent values of expressions in a 
program involving conditionals and linear arithmetic (or uninterpreted 
functions). We show how to construct them easily from a program, and 
give a randomized linear time algorithm (or quadratic time for uninter- 
preted functions) for comparing FCEDs for equality. FCEDs are compact 
due to maximal representation sharing for portions of the program with 
independent conditionals. They inherit from BDDs the precise reasoning 
about boolean expressions needed to handle dependent conditionals. 



1 Introduction 

Data structures and algorithms for manipulating boolean expressions (e.g., bi- 
nary decision diagrams) have played a crucial role in the success of model check- 
ing for hardware and software systems. Software programs are often transformed 
using boolean abstraction [4] to boolean programs: arithmetic operations and 
other operators are modeled conservatively by their effect on a number of boolean 
variables that encode predicates on program state. In this paper, we show that 
we can reason efficiently and precisely about programs that contain not only 
boolean expressions but also linear arithmetic and uninterpreted functions. Such 
algorithms are useful when the desired level of precision cannot be achieved with 
boolean abstraction of linear arithmetic expressions in a program. 

Consider the program fragment shown in Figure 1. The atomic boolean
expressions in the conditionals (e.g. x < y, y == z) have been abstracted as boolean
variables $c_1$ and $c_2$. We assume that the conditional abstraction procedure can

* This research was supported in part by the National Science Foundation Grant CCR- 
0081588, and gifts from Microsoft Research. The information presented here does 
not necessarily reflect the position or the policy of the Government and no official 
endorsement should be inferred. 
Fig. 1. An example program fragment.



sometimes detect equivalences of atomic boolean expressions (e.g. x < y and 
y > x are equivalent), as is the case for the first and last conditionals in the
program. Suppose our goal is to determine the validity of the two assertions in 
the program. The first assertion holds because it is established on all four paths 
that can reach it. The second assertion holds only because the first and last con- 
ditionals use identical guards. A good algorithm for verifying these assertions 
should be able to handle such dependent conditionals (two conditionals are
dependent if the truth value of one depends on the other), or in other words perform a
path-sensitive analysis, without individually examining an exponential number 
of paths that arise for portions of the program with independent conditionals. 

Since there is no obvious boolean abstraction for this example, we need to 
reason about the linear arithmetic directly. There are two kinds of algorithms 
known to solve this problem. On one extreme, there are abstract/random inter- 
pretation based polynomial-time algorithms, which perform a path-insensitive 
analysis. Karr described a deterministic algorithm [22] based on abstract inter- 
pretation [11]. Recently, we gave a faster randomized algorithm [18] based on 
random interpretation. These algorithms are able to decide the first assertion 
in the program since the first two conditionals preceding it are independent of 
each other. However, these algorithms cannot verify that the second assertion 
holds, because they would attempt to do so over all the eight paths through the 
program, including four infeasible ones. 

On the other extreme, there are multi-terminal binary decision diagram 
(MTBDD) [15] based algorithms that consider all feasible paths in a program 
Fig. 2. The MTBDD representation for symbolic values of variables of the program
in Figure 1. The internal nodes are conditionals whose left child corresponds to the
conditional being true. The leaves are canonicalized linear arithmetic expressions.

[Graphics for Fig. 2 and Fig. 3 not recoverable from the extracted text.]

Fig. 3. The VDG/FCED representations for symbolic values of variables of the program
in Figure 1. The internal nodes also involve arithmetic operations. This leads to succinct
representations, and allows sharing.



explicitly, and hence are able to decide both assertions in our example. However, 
these algorithms run in exponential time even when most of the conditionals in a 
program are independent of each other, which is quite often the case. MTBDDs 
are binary decision diagrams whose leaves are not boolean values but canonical- 
ized linear expressions. For the example program, the MTBDDs corresponding 
to final values of the various variables are shown in Figure 2. These MTBDDs use 
the same ordering of boolean variables and the same canonicalization for leaves. 
With MTBDDs we can verify both assertions; however note that checking equal- 
ity between w and y essentially involves performing the check individually on 
each of the four paths from the beginning of the program to the first assertion. 
Also note that there is little opportunity for sharing subexpressions in a MTBDD 
due to the need to push computations down to the leaves and to canonicalize the 
leaves. This algorithm is exponential in the number of boolean variables in the 
program. Its weak point is the handling of sequences of independent conditionals 
and its strong point is that it can naturally handle dependent conditionals, just 
like a BDD does for a boolean program. 

In this paper, we describe data structures and algorithms that combine 
the efficiency of the path-insensitive polynomial-time algorithms with the pre- 
cision of the MTBDD-based algorithms. Consider representing the values of 
w and y using value dependency graph (VDG) [28], as shown in Figure 3. 
Such a representation can be easily obtained by symbolic evaluation of the 
program. Note that this representation is exponentially more succinct than
MTBDDs. For example, note that $|VDG(y)| = |VDG(b)| + |VDG(a)|$ while
$|MTBDD(y)| = |MTBDD(b)| \times |MTBDD(a)|$ (here $|VDG(y)|$ denotes the size
of the VDG representation for y). This is because VDGs do not need to maintain
a normal form for expressions unlike MTBDDs, which even require a normal
form for their leaves. For example, w and y, which are equivalent expressions,
have distinct VDG representations as shown in Figure 3. A VDG for any
expression can share nodes with the VDGs for its subexpressions. For example, note
that VDG(y) shares nodes with VDG(b) and VDG(a). On the other hand, an
MTBDD typically cannot exploit any sharing that is induced by the order in 
which a program computes expressions. 
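
The additive versus multiplicative growth can be made concrete with a small count. The sketch below is an illustration only, not taken from the paper: it assumes an expression that sums n independent conditionals of the form "if c_i then p_i else q_i", and the two size models (`vdg_size`, `mtbdd_leaves`) are hypothetical simplifications that assume all MTBDD leaves are distinct.

```python
def vdg_size(n: int) -> int:
    """Node count of a VDG-style DAG for a sum of n independent conditionals.

    Assumed model: each conditional contributes one choice node and two
    leaves, and the n - 1 additions contribute one Plus node each.
    """
    return 3 * n + (n - 1)


def mtbdd_leaves(n: int) -> int:
    """Leaf count of an MTBDD for the same sum (worst case).

    Each of the 2**n boolean assignments may select a different canonical
    linear expression, so the number of leaves multiplies per conditional.
    """
    return 2 ** n


for n in (1, 2, 4, 8, 16):
    print(n, vdg_size(n), mtbdd_leaves(n))
```

Under these assumptions the VDG grows linearly in n while the MTBDD leaf count doubles with every additional independent conditional.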

The challenge now is to check equivalence of two VDGs. We do not know 
of any efficient deterministic algorithm to solve this problem. We show in this 
paper a randomized algorithm that can check equivalence of two free VDGs in 
linear time. A VDG is said to be free if every boolean variable occurs at most 
once on any path from the root node to a leaf. Note that if all conditionals in a 
program are independent of each other, then the VDG for any expression in the 
program is free. For example, the VDGs shown in Figure 3 are free. 
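
The freeness condition can be checked by a simple traversal. The following is a minimal sketch under assumed conventions: the `Node` class and its fields are hypothetical, and the traversal does not memoize shared nodes, so it may revisit parts of a DAG.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical VDG node. `var` names the boolean variable tested at a
    conditional node and is None for arithmetic nodes and leaves."""
    label: str
    var: Optional[str] = None
    children: tuple = ()


def is_free(node: Node, seen: frozenset = frozenset()) -> bool:
    """True if no boolean variable is tested twice on any root-to-leaf path."""
    if node.var is not None:
        if node.var in seen:
            return False
        seen = seen | {node.var}
    return all(is_free(child, seen) for child in node.children)


lock0, one = Node("lock0"), Node("1")
# Two independent conditionals joined by an addition: free.
free_vdg = Node("+", children=(
    Node("if", var="c1", children=(one, lock0)),
    Node("if", var="c2", children=(one, lock0)),
))
# The variable c1 is tested again below a test of c1: not free.
not_free = Node("if", var="c1", children=(
    Node("if", var="c1", children=(one, lock0)),
    lock0,
))
print(is_free(free_vdg), is_free(not_free))
```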

In this paper, we propose Free Conditional Expression Diagrams (FCEDs),
which are a generalization of free VDGs. We describe a transformation that
generates an FCED for any expression in a loop-free program, and a randomized
algorithm that checks equivalence of two FCEDs in linear time. This, in turn,
gives an algorithm for checking the validity of assertions $e_1 = e_2$ in programs
that contain linear arithmetic and conditionals. This algorithm is more efficient 
than the MTBDD-based algorithm. In particular, if all conditionals in a pro- 
gram are independent of each other, then this algorithm is as fast as the random 
interpretation based algorithm, which runs in polynomial time, as opposed to 
the MTBDD-based algorithm, which has exponential cost. However, the new 
algorithm still has the same worst-case complexity as the MTBDD-based
algorithm (this happens when all conditionals in the program are arbitrary boolean
expressions involving the same set of boolean variables). This is not surprising
since the problem of checking equality assertions in a program with dependent 
conditionals is NP-hard and it is generally believed that even randomized algo- 
rithms cannot solve such problems in polynomial time. 

In Section 2, we describe the FCED construction and the randomized
equivalence testing algorithm for conditional linear arithmetic expressions. In Section 3,
we describe the FCED construction and the randomized equivalence testing
algorithm for conditional uninterpreted function terms.

2 Analysis for Linear Arithmetic 

2.1 Problem Definition 

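Let $\mathcal{L}_a$ be the following conditional arithmetic expression language over rational
constants q, rational variables x, boolean variables c, and boolean expressions b.

\[
\begin{aligned}
e &::= q \mid x \mid e_1 + e_2 \mid e_1 - e_2 \mid q \times e \mid \text{if } b \text{ then } e_1 \text{ else } e_2 \\
b &::= c \mid b_1 \wedge b_2 \mid b_1 \vee b_2
\end{aligned}
\]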
We want a data structure FCED to succinctly represent the expressions in
language $\mathcal{L}_a$ and support efficient algorithms for the following two problems:

P1. Given the FCEDs for the sub-expressions of an expression $e \in \mathcal{L}_a$, construct
the FCED for the expression e.

P2. Given the FCED representations for two expressions $e_1, e_2 \in \mathcal{L}_a$, decide
whether $e_1 = e_2$.

Note that the symbolic value of any expression in our example program
belongs to the language $\mathcal{L}_a$. For example, the value of $lock_1$ is "if $c_1$ then $lock_0 + 1$
else $lock_0$". Hence, algorithms for problems P1 and P2 can be used to check
equivalence of two expressions in a loop-free program. In general, if a program
has loops, then since the lattice of linear equality facts has finite height k (where
k is the number of variables in the program), one can analyze a suitable unrolling
of the loops in the program to verify the assertions [22, 18].

Note that we assume that there is an abstraction procedure for conditionals
that maps atomic conditionals to boolean variables such that only equivalent
conditionals are mapped to the same boolean variable. Equivalent conditionals
can be detected by using standard value numbering heuristics [25, 1]
($e_1 \; relop \; e_2 \equiv e_1' \; relop' \; e_2'$ if $e_1 = e_1'$ and $e_2 = e_2'$ and $relop = relop'$) or other
sophisticated heuristics [24] (e.g. $e_1 \; relop \; e_2 \equiv e_1' \; relop' \; e_2'$ if $e_1 - e_2 = e_1' - e_2'$
and $relop = relop'$). Here relop stands for a relational operator, e.g. =, < or >.
Note that detecting equivalence of conditionals involves detecting equivalence of 
expressions, which in turn can be done by using a simple technique like value 
numbering. We can even use the result of our analysis to detect those equiva- 
lences on the fly. 
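
As a rough illustration of such an abstraction procedure, the sketch below maps atomic conditionals to boolean variable names and reuses a name when two conditionals are recognized as equivalent. It is an assumption-laden toy, not the heuristics of [25, 1] or [24]: operands are plain strings, equivalence is detected only by flipping `>` and `>=` comparisons, and the class and function names are hypothetical.

```python
# Mirror-image operators: `a > b` is rewritten as `b < a`.
FLIP = {">": "<", ">=": "<="}


def canonical(lhs: str, relop: str, rhs: str) -> tuple:
    """Return a canonical key so mirrored comparisons compare equal."""
    if relop in FLIP:
        return (rhs, FLIP[relop], lhs)
    return (lhs, relop, rhs)


class ConditionalAbstraction:
    """Assigns boolean variables c1, c2, ... to atomic conditionals."""

    def __init__(self) -> None:
        self._names = {}

    def boolean_var(self, lhs: str, relop: str, rhs: str) -> str:
        key = canonical(lhs, relop, rhs)
        if key not in self._names:
            self._names[key] = f"c{len(self._names) + 1}"
        return self._names[key]


abstraction = ConditionalAbstraction()
first = abstraction.boolean_var("x", "<", "y")
second = abstraction.boolean_var("y", "==", "z")
last = abstraction.boolean_var("y", ">", "x")  # recognized as x < y
print(first, second, last)
```

A real implementation would compare value numbers of the operand expressions rather than their spelling.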



2.2 FCED Construction 

An FCED for linear arithmetic is a DAG generated by the following language 
over rational constants q, rational variables x and boolean expressions g, which 
we call guards. 

\[
f ::= x \mid q \mid Plus(f_1, f_2) \mid Minus(f_1, f_2) \mid Times(q, f) \mid Choose(f_1, f_2) \mid Guard_g(f)
\]

The Choose and Guard node types are inspired by Dijkstra’s guarded command 
language [14]. Given a boolean assignment $\rho$, the meaning of $Guard_g(f)$ is either
the meaning of f (if g is true in $\rho$) or undefined (otherwise). The meaning of a
Choose node is the meaning of its child that is defined. The Choose operator 
here is deterministic in the sense that at most one of its children is defined given 
any boolean assignment. 
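
The semantics of Choose and Guard can be sketched as a small interpreter. The encoding below is an assumption made for illustration: FCED nodes are nested tuples, guards are `True`, `False`, a variable name, or `("if", c, g1, g2)`, and `None` stands for "undefined". Propagating `None` through arithmetic nodes is also an assumption; the text only defines the meaning of Guard and Choose.

```python
from fractions import Fraction


def guard_holds(g, rho):
    """Evaluate a guard under the boolean assignment rho."""
    if isinstance(g, bool):
        return g
    if isinstance(g, str):
        return rho[g]
    _, c, g1, g2 = g
    return guard_holds(g1 if rho[c] else g2, rho)


def meaning(f, rho, env):
    """Meaning of FCED f under boolean assignment rho and rational values env."""
    if isinstance(f, (int, Fraction)):
        return Fraction(f)
    if isinstance(f, str):
        return env[f]
    op = f[0]
    if op == "guard":
        _, g, body = f
        return meaning(body, rho, env) if guard_holds(g, rho) else None
    if op == "choose":
        values = [meaning(child, rho, env) for child in f[1:]]
        defined = [v for v in values if v is not None]
        assert len(defined) <= 1, "Choose must be deterministic"
        return defined[0] if defined else None
    if op == "times":
        v = meaning(f[2], rho, env)
        return None if v is None else f[1] * v
    if op in ("plus", "minus"):
        a, b = meaning(f[1], rho, env), meaning(f[2], rho, env)
        if a is None or b is None:
            return None
        return a + b if op == "plus" else a - b
    raise ValueError(f"unknown node type: {op}")


# lock1 = if c1 then lock0 + 1 else lock0
lock1 = ("choose",
         ("guard", "c1", ("plus", "lock0", 1)),
         ("guard", ("if", "c1", False, True), "lock0"))
env = {"lock0": Fraction(3)}
print(meaning(lock1, {"c1": True}, env), meaning(lock1, {"c1": False}, env))
```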

The guards g are represented using Reduced Ordered Binary Decision
Diagrams (ROBDDs). Let $\prec$ be the total ordering on program variables used in
these ROBDD representations. For any sets of boolean variables $B_1$ and $B_2$, we
use the notation $B_1 \prec B_2$ to denote that $B_1 \cap B_2 = \emptyset$ and $c_1 \prec c_2$ for all variables
$c_1 \in B_1$ and $c_2 \in B_2$. The guards g can be described by the following language
over boolean variables c.

\[
g ::= true \mid false \mid c \mid If(c, g_1, g_2)
\]

We assume that we can compute conjunction ($\wedge$) of two guards and negation
($\neg$) of a guard. For any boolean guard g, let BV(g) denote the set of boolean
variables that occur in g. Similarly, for any FCED node f, let BV(f) denote the
set of boolean variables that occur below node f. An FCED f must satisfy the
following invariant:

Invariant 1 For any guard node $Guard_{g_i}(f_i)$ in FCED f, $BV(g_i) \prec BV(f_i)$.

Invariant 1 is similar to the ROBDDs’ requirement that boolean variables on any 
path from the root node to a leaf must be ordered. As we shall see, it plays an 
important role in the randomized equivalence testing algorithm that we propose. 
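
The ordering relation used in Invariant 1 holds between two sets of boolean variables when they are disjoint and every variable of the first set comes before every variable of the second in the ROBDD variable order. A minimal sketch of that check follows; the function name and the list-based encoding of the order are assumptions made for illustration.

```python
def precedes(b1: set, b2: set, order: list) -> bool:
    """True if b1 and b2 are disjoint and every variable of b1 comes
    before every variable of b2 in the total order `order`."""
    rank = {v: i for i, v in enumerate(order)}
    if b1 & b2:
        return False
    return all(rank[c1] < rank[c2] for c1 in b1 for c2 in b2)


order = ["c1", "c2", "c3"]
print(precedes({"c1"}, {"c2", "c3"}, order))  # guard over c1, body over c2, c3
print(precedes({"c2"}, {"c1"}, order))        # wrong order
print(precedes({"c1"}, {"c1", "c2"}, order))  # not disjoint
```

A guard node satisfies the invariant when this check succeeds for the variables of its guard and the variables below it.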

The FCED representation of any expression e is denoted by FCED(e) and
is computed inductively as follows:

\[
\begin{aligned}
FCED(x) &= x \\
FCED(q) &= q \\
FCED(e_1 + e_2) &= Plus(FCED(e_1), FCED(e_2)) \\
FCED(e_1 - e_2) &= Minus(FCED(e_1), FCED(e_2)) \\
FCED(q \times e) &= Times(q, FCED(e)) \\
FCED(\text{if } b \text{ then } e_1 \text{ else } e_2) &= Choose(\|g_b, FCED(e_1)\|, \|\neg g_b, FCED(e_2)\|)
\end{aligned}
\]

where $g_b$ is the ROBDD representation of the boolean expression b as a guard.
The normalization operator $\|g, f\|$ takes as input a boolean guard g and an
FCED f and returns another FCED whose meaning is equivalent to $Guard_g(f)$,
except that Invariant 1 is satisfied:



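\[
\begin{aligned}
\|g, f\| &= Guard_g(f), && \text{if } BV(g) \prec BV(f) \\
\|g, f\| &= Guard_g(f[g]), && \text{if } g \text{ is a conjunction of literals} \\
\|g, Plus(f_1, f_2)\| &= Plus(\|g, f_1\|, \|g, f_2\|) \\
\|g, Minus(f_1, f_2)\| &= Minus(\|g, f_1\|, \|g, f_2\|) \\
\|g, Times(q, f)\| &= Times(q, \|g, f\|) \\
\|g, Choose(f_1, f_2)\| &= Choose(\|g, f_1\|, \|g, f_2\|) \\
\|g, Guard_{g'}(f')\| &= Guard_{g'}(\|g, f'\|), && \text{if } BV(g') \prec BV(g) \\
\|g, Guard_{g'}(f')\| &= \|g \wedge g', f'\|, && \text{otherwise}
\end{aligned}
\]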



where f[g] denotes the FCED obtained from f by replacing any boolean variable
c by true or false, if it occurs in g in non-negated or negated form respectively.
The purpose of the normalization $\|g, f\|$ is to simplify f or to push the guard g
down into f until a point when the boolean variables in g and f are disjoint, thus
ensuring that Invariant 1 is maintained. Figure 4 shows the FCED for variable
$lock_2$ in our example program. Figure 4 also shows the FCED for $\|c_1, w\|$, where
the FCED for w has been shown in Figure 3. We use the notation $c(f_1, f_2)$
as a syntactic sugar for the FCED $Choose(Guard_c(f_1), Guard_{\neg c}(f_2))$. We also
simplify an FCED $Choose(Guard_g(f_1), Guard_{false}(f_2))$ to $Guard_g(f_1)$.

Fig. 4. An example of FCED and normalization operator.

2.3 Randomized Equivalence Testing 

In this section, we describe an algorithm that decides equivalence of two FCEDs. 
The algorithm assigns a hash value V(n) to each node n in an FCED, computed 
in a bottom-up manner from the hash values of its immediate children. The hash 
value of an FCED is defined to be the hash value assigned to its root. Two FCEDs 
are declared equivalent iff they have the same hash values. This algorithm has a
one-sided error probability. If two FCEDs have different hash values, then they are
guaranteed to be non-equivalent. However, if two FCEDs are not equivalent, then
there is a very small probability (over the random choices made by the algorithm)
that they will be assigned the same hash values. The error probability can be made
arbitrarily small by setting the parameters of the algorithm appropriately. 

For the purpose of assigning a hash value to an FCED representation of any
expression in $\mathcal{L}_a$, we choose a random value for each of the boolean and rational
variables. The random values for both kinds of variables are chosen independently
of each other and uniformly at random from some finite set of rationals. (Note
that we choose a rational random value even for boolean variables.) For any
variable y, let $r_y$ denote the random value chosen for y. The hash value V(n) is
assigned inductively to any node n in an FCED as follows:

\[
\begin{aligned}
V(q) &= q \\
V(x) &= r_x \\
V(Plus(f_1, f_2)) &= V(f_1) + V(f_2) \\
V(Minus(f_1, f_2)) &= V(f_1) - V(f_2) \\
V(Times(q, f)) &= q \times V(f) \\
V(Choose(f_1, f_2)) &= V(f_1) + V(f_2) \\
V(Guard_g(f)) &= H(g) \times V(f)
\end{aligned}
\]

where the hash function H for a boolean guard g is as defined below.

\[
\begin{aligned}
H(true) &= 1 \\
H(false) &= 0 \\
H(c) &= r_c \\
H(If(c, g_1, g_2)) &= r_c \times H(g_1) + (1 - r_c) \times H(g_2)
\end{aligned}
\]

For example, note that w = (if $c_1$ then $lock_0 + 1$ else $lock_0$) + (if $c_2$ then 2
else 3) and y = (if $c_2$ then $3 + lock_0$ else $4 + lock_0$) - (if $c_1$ then 0 else 1) in our
example program. If we choose $r_{lock_0} = 3$, $r_{c_1} = 5$, $r_{c_2} = -3$, then V(w) =
V(y) = 14, thereby validating the assertion w = y. If we choose random boolean
values for boolean variables while computing hash values, then we would
essentially be hashing the symbolic values of expressions on one random path
(corresponding to the random boolean choice). However, it is essential to check
for the equivalence of expressions on all paths. Choosing non-boolean random
values for boolean variables helps us to do that by essentially computing a
random weighted combination of the hash values of expressions on all paths. In the
next section, we explain more formally why, with high probability, this hashing
scheme assigns equal values only to equivalent expressions.
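
The hash computation for w and y can be reproduced with a few lines of code. The sketch below assumes a tuple encoding of FCEDs and guards that is not part of the paper (guards are `True`, `False`, a variable name, or `("if", c, g1, g2)`), and the helpers `neg` and `ite` are hypothetical conveniences. It uses the sample values 3, 5 and -3 for lock0, c1 and c2.

```python
from fractions import Fraction


def H(g, r):
    """Hash of a guard under the random values r."""
    if g is True:
        return Fraction(1)
    if g is False:
        return Fraction(0)
    if isinstance(g, str):
        return r[g]
    _, c, g1, g2 = g
    return r[c] * H(g1, r) + (1 - r[c]) * H(g2, r)


def V(f, r):
    """Hash of an FCED node, computed bottom-up."""
    if isinstance(f, (int, Fraction)):
        return Fraction(f)
    if isinstance(f, str):
        return r[f]
    op = f[0]
    if op in ("plus", "choose"):
        return V(f[1], r) + V(f[2], r)
    if op == "minus":
        return V(f[1], r) - V(f[2], r)
    if op == "times":
        return f[1] * V(f[2], r)
    if op == "guard":
        return H(f[1], r) * V(f[2], r)
    raise ValueError(f"unknown node type: {op}")


def neg(c):
    """ROBDD-style negation of a single variable."""
    return ("if", c, False, True)


def ite(c, f1, f2):
    """The sugar c(f1, f2) = Choose(Guard_c(f1), Guard_{not c}(f2))."""
    return ("choose", ("guard", c, f1), ("guard", neg(c), f2))


w = ("plus", ite("c1", ("plus", "lock0", 1), "lock0"), ite("c2", 2, 3))
y = ("minus",
     ite("c2", ("plus", 3, "lock0"), ("plus", 4, "lock0")),
     ite("c1", 0, 1))
r = {"lock0": Fraction(3), "c1": Fraction(5), "c2": Fraction(-3)}
print(V(w, r), V(y, r))  # prints "14 14"
```

Note that Choose is hashed exactly like Plus: the guards of its children contribute complementary weights, so the sum is a weighted combination of both branches.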



2.4 Completeness and Probabilistic Soundness of the Algorithm 

Let e be any expression in language $\mathcal{L}_a$. Let P(FCED(e)) denote the polynomial
obtained by using variables x and c instead of random values $r_x$ and $r_c$, while
computing V(FCED(e)). The following properties hold.

T1. V(FCED(e)) is the result of evaluating the polynomial P(FCED(e)) at
random values $r_y$ chosen for each variable y that occurs in P(FCED(e)).
T2. For any FCED f, P(f) is a multi-linear polynomial, i.e. the degree of any
variable is at most 1. This is due to the freeness property of an FCED
(ensured by Invariant 1).

T3. $e_1 = e_2$ iff $P(FCED(e_1))$ and $P(FCED(e_2))$ are equivalent polynomials.

Property T1 is trivial to prove. The proof of property T2 is based on the
observation that H(g) is multi-linear for any guard g (this is because every boolean
variable occurs at most once on any path from the root node to a leaf in an
ROBDD), and for any $Guard_g(f)$ node in an FCED, $BV(g) \cap BV(f) = \emptyset$. The
proof of property T3 is given in the full version of the paper [20].

These properties imply that the equivalence testing algorithm is complete,
i.e., it assigns the same hash values to equal expressions. Suppose $e_1$ and $e_2$ are equal
expressions. It follows from T3 that $P(FCED(e_1)) = P(FCED(e_2))$. Since
$P(FCED(e_1))$ and $P(FCED(e_2))$ are multi-linear (implied by T2), they are
equivalent even when the boolean variables are treated as rational variables. This
is a standard fact and is the basis of several algorithms [5, 17, 13, 12]. Therefore,
it follows from T1 that $V(FCED(e_1)) = V(FCED(e_2))$.

Properties T1 and T3 imply that the algorithm is probabilistically sound,
i.e., it assigns different hash values to non-equivalent expressions with high
Fig. 5. The surface shows values of expression 1 - u for different values of $c_1$ and $c_2$.

probability over the random choices that it makes. Suppose $e_1 \neq e_2$. It
follows from T3 that $P(FCED(e_1)) \neq P(FCED(e_2))$. Trivially, $P(FCED(e_1)) \neq
P(FCED(e_2))$ even when boolean variables are treated as rational variables. It
then follows from the classic Schwartz's theorem [27] (on testing equivalence of
two polynomials) that the probability that $P(FCED(e_1))$ and $P(FCED(e_2))$
evaluate to the same value on random assignment is bounded above by $\frac{d}{s}$,
where d is the maximum of the degrees of the polynomials $P(FCED(e_1))$ and
$P(FCED(e_2))$ (these are bounded above by the size of the expressions $e_1$ and $e_2$
respectively), and s is the size of the set from which random values are chosen.
Therefore, it follows from T1 that $Pr[V(FCED(e_1)) \neq V(FCED(e_2))] \geq 1 - \frac{d}{s}$.
(Here $Pr[V(FCED(e_1)) \neq V(FCED(e_2))]$ denotes the probability of the event
$V(FCED(e_1)) \neq V(FCED(e_2))$ over the choice of the random values $r_y$ for all
variables y.)

Note that the error probability can be made arbitrarily small by choosing 
random values from a large enough set. For boolean variables, this set cannot 
contain more than 2 elements. It is precisely for this reason that we require prop- 
erty T2, so as to be able to treat boolean variables as rational variables without 
affecting equivalences of polynomials. Note that multi-linearity is a necessary re- 
quirement. For example, consider the two equivalent polynomials C2C3 -I- and 
Cl -I- C3C2 over the boolean variables ci and C2. These polynomials are not equiv- 
alent when the variables Ci and C2 are interpreted as rational variables since the 
first polynomial is not multi-linear in Ci. 
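
The role of multi-linearity can be seen with a pair of polynomials chosen here purely for illustration (they are our own example, not necessarily the pair discussed in the text): `p` has degree 2 in its first variable, `q` is multi-linear, and the two agree on every boolean assignment yet differ on rational inputs.

```python
from fractions import Fraction
from itertools import product


def p(c1, c2):
    """Not multi-linear: c1 appears with degree 2."""
    return c1 * c1 + c2


def q(c1, c2):
    """Multi-linear: every variable has degree at most 1."""
    return c1 + c2


# On booleans (0 or 1) we have c*c == c, so p and q coincide.
agree_on_booleans = all(
    p(a, b) == q(a, b) for a, b in product((0, 1), repeat=2)
)
# On rational values the two polynomials are distinguishable.
differ_on_rationals = p(Fraction(5), Fraction(-3)) != q(Fraction(5), Fraction(-3))
print(agree_on_booleans, differ_on_rationals)
```

A hashing scheme that evaluated `p` and `q` at random rationals would therefore report two boolean-equivalent expressions as different, which is why the FCED invariant is needed.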

This randomized algorithm for equivalence checking can be explained
informally using a geometric argument. For example, consider the validity of the
statement u = 1 at the place of the first assertion in Figure 1. This statement is
false since it holds on only three of the four paths that reach it. It is false when $c_1$
is false and $c_2$ is true. Figure 5 shows a surface in a 3-dimensional space whose
z coordinate reflects the value of expression 1 - u as a function of the (rational)
assignment for $c_1$ and $c_2$. Since there is at least one boolean assignment for $c_1$
and $c_2$ where 1 - u is not zero, and since the degree of the surface is small (2
in this case), it follows that the surface intersects the $c_1$-$c_2$ plane in a "small"
e =               e1 ± e2         if b then e1 else e2         q × e1     F(e1, e2)
T(e) for FCED =   Constant        S(g_b) × (S(e1) + S(e2))     Constant   Constant
T(e) for MTBDD =  S(e1) × S(e2)   S(g_b) × (S(e1) + S(e2))     S(e1)      S(e1) × S(e2)

Fig. 6. A table comparing the time and space complexity T(e) for constructing FCEDs
and MTBDDs of an expression from the representation of its subexpressions.



number of points. This allows the quick discovery, with high probability, of this
false assertion by random sampling of the surface (this corresponds to choosing
random rational values for boolean variables). If, on the other hand, the surface
corresponds to a true assertion, then it is included in the $c_1$-$c_2$ plane and any
sampling would verify that.

2.5 Time and Space Complexity 

The time required to compute the hash value of an FCED is clearly linear in 
the size of the FCED. However, this is under the assumption that all basic 
arithmetic operations (like addition, multiplication) to compute the hash value 
can be performed in unit time. This assumption is not necessarily true since the 
size of the numbers involved may increase with each arithmetic operation. The 
standard technique to deal with this problem is to do the arithmetic operations 
modulo a randomly chosen prime p [23]. This makes sure that at each stage, the 
numbers can be represented within a constant number of bits and hence each 
arithmetic operation can be performed in constant time. The modular arithmetic 
adds an additional small probability of error in our algorithm. 
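
A minimal sketch of the modular variant follows. Several choices here are assumptions made for illustration: the prime is fixed (2^31 - 1) rather than randomly chosen, constants are restricted to integers so that no modular inverses are needed, and FCEDs use a hypothetical tuple encoding.

```python
import random

P = 2_147_483_647  # 2**31 - 1, a Mersenne prime; fixed only for reproducibility


def H(g, r):
    """Guard hash modulo P; guards are True, False, a name, or ("if", c, g1, g2)."""
    if g is True:
        return 1
    if g is False:
        return 0
    if isinstance(g, str):
        return r[g] % P
    _, c, g1, g2 = g
    return (r[c] * H(g1, r) + (1 - r[c]) * H(g2, r)) % P


def V(f, r):
    """FCED hash modulo P; every intermediate value stays below P."""
    if isinstance(f, int):
        return f % P
    if isinstance(f, str):
        return r[f] % P
    op = f[0]
    if op in ("plus", "choose"):
        return (V(f[1], r) + V(f[2], r)) % P
    if op == "minus":
        return (V(f[1], r) - V(f[2], r)) % P
    if op == "times":
        return (f[1] * V(f[2], r)) % P
    if op == "guard":
        return (H(f[1], r) * V(f[2], r)) % P
    raise ValueError(f"unknown node type: {op}")


def ite(c, f1, f2):
    """c(f1, f2) = Choose(Guard_c(f1), Guard_{not c}(f2))."""
    return ("choose", ("guard", c, f1), ("guard", ("if", c, False, True), f2))


w = ("plus", ite("c1", ("plus", "lock0", 1), "lock0"), ite("c2", 2, 3))
y = ("minus",
     ite("c2", ("plus", 3, "lock0"), ("plus", 4, "lock0")),
     ite("c1", 0, 1))

rng = random.Random(0)
r = {name: rng.randrange(P) for name in ("lock0", "c1", "c2")}
print(V(w, r) == V(y, r))               # equivalent expressions hash equally
print(V(("plus", w, 1), r) == V(y, r))  # w + 1 differs from y
```

Because w and y denote the same polynomial, their hashes agree for every choice of random values; the modulus only bounds the size of the numbers involved.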

The time and extra space T(e) required to construct the FCED of an
expression e from the FCEDs of the subexpressions of e depends on the structure of
e. If e is of the form q, x, $e_1 \pm e_2$, or $q \times e$, then it is easy to see that T(e) is
constant. If e is of the form if b then $e_1$ else $e_2$, then an amortized cost analysis
would show that $T(e) = O(S(g_b) \times (S(e_1) + S(e_2)))$, where $g_b = ROBDD(b)$ and
$S(g_b)$ denotes the size of the ROBDD $g_b$. $S(e_i)$ denotes the size of the FCED
of expression $e_i$ (when represented as a tree; however, the boolean guards in $e_i$
may be represented as DAGs). The upper bound on time complexity for this case
relies on Invariant 1 and assumes some sharing of common portions of ROBDDs
that arise during the construction of FCED(e).

If all conditionals in a program are independent of each other, then it is easy
to see that $FCED(e)$ is linear in the size of $e$, as opposed to the possibly exponential
size implied by the above-mentioned bounds on T(e). Figure 6 compares T(e)
for FCED and MTBDD representations. The last column in the table refers to 
the next section. 

3 Analysis for Uninterpreted Functions 

Reasoning precisely about program operators other than linear arithmetic
operators is in general undecidable. A commonly used abstraction is to model any
n-ary non-linear program operator as an uninterpreted function under the theory of
equality, which has only one axiom, namely,

$F(x_1, \ldots, x_n) = F'(x'_1, \ldots, x'_n) \iff F = F'$ and $x_i = x'_i$ for all $1 \le i \le n$.

The process of detecting this form of
equivalence, where the operators are treated as uninterpreted functions, is also
referred to as value numbering. In this section, we describe how to construct
FCEDs for uninterpreted functions.
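
To make the notion of value numbering concrete, here is a minimal sketch of classic hash-based value numbering for a single straight-line block. It illustrates the equality axiom above and is not the FCED construction; the triple encoding of assignments and the name `value_number` are assumptions of this sketch.

```python
from itertools import count

def value_number(block):
    """Assign value numbers to the variables of a straight-line block.

    block is a list of (target, op, args) triples, where op names an
    uninterpreted operator and args are variable names. Two terms get the
    same number when they apply the same operator to arguments that already
    have equal numbers.
    """
    fresh = count()
    var_vn = {}    # variable name -> value number
    term_vn = {}   # (op, argument value numbers) -> value number

    def number_of(var):
        # A variable read before any assignment gets a fresh number.
        if var not in var_vn:
            var_vn[var] = next(fresh)
        return var_vn[var]

    for target, op, args in block:
        key = (op, tuple(number_of(a) for a in args))
        if key not in term_vn:
            term_vn[key] = next(fresh)
        var_vn[target] = term_vn[key]
    return var_vn

block = [
    ("a", "F", ("x", "y")),
    ("b", "F", ("x", "y")),   # same operator and arguments as a
    ("c", "F", ("y", "x")),   # argument order differs, so not equal to a
    ("d", "G", ("a", "c")),
    ("e", "G", ("b", "c")),   # equal to d because a and b are equal
]
vn = value_number(block)
```

In this example a and b share a value number, as do d and e, while c stays distinct from a because an uninterpreted F is not assumed to be commutative.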



3.1 Problem Definition 

Let $\mathcal{L}_u$ be the following language over boolean expressions $b$, variables $x$ and an
uninterpreted function symbol $F$ of arity two.

$e ::= x \mid F(e_1, e_2) \mid \text{if } b \text{ then } e_1 \text{ else } e_2$

For simplicity, we consider only one binary uninterpreted function F. Our results 
can be extended easily to languages with any finite number of uninterpreted 
functions of any finite arity. However, note that this language does not contain 
any linear arithmetic operators. 

We want a data structure to succinctly represent the expressions in language
$\mathcal{L}_u$ and support efficient algorithms for problems similar to those mentioned
in Section 2.1. This would be useful to check equivalence of two expressions in
any loop-free program. As before, it turns out that the lattice of sets of equivalences
among uninterpreted function terms has finite height $k$ (where $k$ is the number
of variables in the program). Hence, if a program has loops, then one can analyze
a suitable unrolling of loops in the program to verify assertions [19, 21].



3.2 FCED Construction 

An FCED in this case is a DAG generated by the following language over
variables $x$ and boolean guards $g$ represented using ROBDDs.

$f ::= x \mid F(f_1, f_2) \mid Choose(f_1, f_2) \mid Guard_g(f)$
$g ::= true \mid false \mid c \mid If(c, g_1, g_2)$

Here $c$ denotes a boolean variable. As before, an FCED satisfies Invariant 1. The
FCED representation of any expression $e$ is computed inductively as follows:

$FCED(x) = x$

$FCED(F(e_1, e_2)) = F(FCED(e_1), FCED(e_2))$

$FCED(\text{if } b \text{ then } e_1 \text{ else } e_2) = Choose(\|g_b, FCED(e_1)\|, \|\neg g_b, FCED(e_2)\|)$

where $g_b = ROBDD(b)$. The normalization operator $\|g, f\|$ takes as input a
boolean guard $g$ and an FCED $f$ and returns another FCED as follows:

$\|g, f\| = Guard_g(f)$, if $BV(g) \cap BV(f) = \emptyset$

$\|g, f\| = Guard_g(f[g])$, if $g$ is a conjunction of literals

$\|g, F(f_1, f_2)\| = F(\|g, f_1\|, \|g, f_2\|)$

$\|g, Choose(f_1, f_2)\| = Choose(\|g, f_1\|, \|g, f_2\|)$

$\|g, Guard_{g'}(f')\| = Guard_{g'}(\|g, f'\|)$, if $BV(g') \cap BV(g) = \emptyset$
$\|g, Guard_{g'}(f')\| = \|g \wedge g', f'\|$, otherwise

where $f[g]$, $BV(g)$ and $BV(f)$ are as defined before in Section 2.2.



3.3 Randomized Equivalence Testing 

The hash values assigned to nodes of FCEDs of expressions in the language $\mathcal{L}_u$
are vectors of $k$ rationals, where $k$ is the largest depth of any expression that
arises. For the purpose of assigning hash values, we choose a random value $r_y$ for
each variable $y$ and two random $k \times k$ matrices $M$ and $N$. The following entries
of the matrices M and N are chosen independently of each other and uniformly 
at random from some set of rationals: and 

for all $2 \le i \le k$. The rest of the entries are chosen to be 0. The hash value $V(n)$
is assigned inductively to any node $n$ in an FCED as follows:

$V(x) = [r_x, \ldots, r_x]$

$V(F(f_1, f_2)) = V(f_1) \times M + V(f_2) \times N$
$V(Choose(f_1, f_2)) = V(f_1) + V(f_2)$

$V(Guard_g(f)) = H(g) \times V(f)$

where $H(g)$ is as defined before in Section 2.3. Note that $H(g) \times V(f)$ denotes
multiplication of vector $V(f)$ by the scalar $H(g)$, while $V(f_1) \times M$ denotes
multiplication of vector $V(f_1)$ by the matrix $M$.

The proof of property T3 (given in the full version of the paper [20]) explains
the reason behind this fancy hashing scheme. Here is some informal intuition. To
maintain multi-linearity, it is important to choose a random linear interpretation
for the uninterpreted function $F$. However, if we let $k = 1$, the hashing scheme
cannot always distinguish between non-equivalent expressions. For example,
consider $e_1 = F(F(x_1, x_2), F(x_3, x_4))$ and $e_2 = F(F(x_1, x_3), F(x_2, x_4))$. Note that
$e_1 \neq e_2$ but $V(FCED(e_1)) = V(FCED(e_2)) = r_{x_1} m^2 + r_{x_2} mn + r_{x_3} nm + r_{x_4} n^2$,
where $m$ and $n$ are some random rationals. This happens because scalar
multiplication is commutative. This problem is avoided if we work with vectors and
matrices because matrix multiplication is not commutative.
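
The collision for k = 1 and its disappearance for k = 2 can be checked numerically. The sketch below is an illustration under stated assumptions rather than the scheme of the paper: it works with integers modulo a fixed prime instead of rationals, it uses small hand-picked matrices instead of the randomly filled sparse matrices described above, and the names `vec_mat` and `hash_term` are hypothetical.

```python
P = 2_147_483_647  # prime modulus; an assumption of this sketch

def vec_mat(v, m):
    """Row vector times square matrix, modulo P."""
    k = len(v)
    return [sum(v[i] * m[i][j] for i in range(k)) % P for j in range(k)]

def hash_term(t, r, M, N):
    """Hash a term built from variable names (strings) and ("F", t1, t2) nodes."""
    k = len(M)
    if isinstance(t, str):
        return [r[t] % P] * k              # V(x) = [r_x, ..., r_x]
    _, t1, t2 = t
    left = vec_mat(hash_term(t1, r, M, N), M)
    right = vec_mat(hash_term(t2, r, M, N), N)
    return [(a + b) % P for a, b in zip(left, right)]

e1 = ("F", ("F", "x1", "x2"), ("F", "x3", "x4"))
e2 = ("F", ("F", "x1", "x3"), ("F", "x2", "x4"))
r = {"x1": 5, "x2": 2, "x3": 3, "x4": 7}

# k = 1: the matrices degenerate to scalars m and n, and the hashes collide.
scalar_collision = hash_term(e1, r, [[4]], [[9]]) == hash_term(e2, r, [[4]], [[9]])

# k = 2 with two matrices that do not commute: the hashes differ.
M = [[1, 1], [0, 1]]
N = [[1, 0], [1, 1]]
matrix_collision = hash_term(e1, r, M, N) == hash_term(e2, r, M, N)
```

The two hashes differ by (V(x2) - V(x3)) × (NM - MN), which vanishes for scalars but not for the non-commuting matrices chosen here.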

3.4 Completeness and Probabilistic Soundness of the Algorithm 

Let $e$ be any expression in language $\mathcal{L}_u$. Let $P(FCED(e))$ denote the $k$-th
polynomial in the symbolic vector obtained by using variable names $x$ and $c$ instead
of random values $r_x$ and $r_c$, and by using variable names $M_{i,j}$ and $N_{i,j}$ instead
of random values for the matrix entries, while computing $V(FCED(e))$. The
properties T1, T2, T3 stated in Section 2.4 hold here also. Properties T1 and T2
are easy to prove as before. However, the proof of property T3 is non-trivial,
and is given in the full version of the paper [20]. These properties imply that
the randomized equivalence testing algorithm is complete and probabilistically 
sound as before. The error probability is bounded above by where d and s 
are as mentioned in Section 2.4. 



3.5 Time and Space Complexity 

The time required to compute the hash value for an FCED $f$ is $O(n \times k)$, where
$n$ is the size of $f$ and $k$ is the size of the largest FCED in the context. The
time and extra space $T(e)$ required to construct the FCED of an expression $e$ in
language $\mathcal{L}_u$ from the FCEDs of its sub-expressions can be estimated similarly
as in Section 2.5, and is shown in Figure 6.



4 Comparison with Related Work 

Path-insensitive versions of the analyses that we have described in this paper
have been well studied. Karr described a polynomial-time algorithm based on
abstract interpretation [22] to reason precisely about linear equalities in a program
with non-deterministic conditionals. Recently, we described a more efficient
algorithm based on the idea of random interpretation [18]. Several polynomial-time
algorithms have been described in the literature for value numbering, which is the
problem of discovering equivalences among program expressions when program
operators are treated as uninterpreted [1,26]. All these algorithms are complete
for basic blocks, but are imprecise in the presence of joins and loops in a program. 
Recently, we described algorithms for global value numbering that discover all 
equivalences among expressions under the assumption that all conditionals are 
non-deterministic and program operators are uninterpreted [19,21]. 

Karthik Gargi described a path-sensitive global value numbering algorithm 
[16] that first discovers equivalent conditionals, and then uses that information 
to do a simple predicated global value numbering. However, this algorithm is 
not complete and cannot handle conditionals as precisely as our algorithm. Our 
algorithm is complete with respect to the abstraction of conditionals to boolean 
variables. Gargi’s algorithm treats all operators as uninterpreted and hence does 
not handle linear arithmetic. 

The model checking community has been more concerned with path-sensitivity,
in an attempt to do whole state-space exploration. The success of ROBDDs
has inspired efforts to improve their efficiency and to expand their range of
applicability [7]. Several generalizations of ROBDDs have been proposed for efficient
boolean manipulation [2, 17]. There have been some efforts to extend the
concept to represent functions over boolean variables that have non-boolean ranges,
such as integers or real numbers (e.g. Multi Terminal Binary Decision Diagrams
(MTBDDs) [3,9], Edge-Valued Binary Decision Diagrams (EVBDDs), Binary
Moment Diagrams (BMDs) [6] and Hybrid Decision Diagrams (HDDs) [8]).
Multiway Decision Graphs (MDGs) have been proposed to represent quantifier-free
formulas over terms involving function symbols [10]. None of the above-mentioned
extensions and generalizations of ROBDDs seems well-suited for software
model checking since they do not directly and efficiently support manipulation of
conditional expressions, i.e. expressions that are built from boolean expressions
and expressions from some other theory like that of arithmetic or uninterpreted
functions. This is because most of these techniques rely on having a canonical
representation for expressions. Figure 2 illustrates the problems that arise with
canonicalization. However, our proposed representation, FCED, can efficiently
represent and manipulate such expressions since it does not require a canonical
representation.

         Handles Dependent   Handles      Handles Independent   Randomized or
         Conditionals        Arithmetic   Conditionals          Deterministic

ROBDD    good                no           good                  deterministic
MTBDD    good                poor         poor                  deterministic
FBG      no                  no           good                  randomized
RI       no                  good         good                  randomized
FCED     good                good         good                  randomized

Fig. 7. A table comparing different data structures for software model-checking.

The idea behind hashing boolean guards g in our randomized equivalence 
testing algorithm is similar to that used for checking equivalence of Free Boolean 
Graphs (FBG) [5], FBDDs [17] and d-DNNFs [13,12], all of which represent
boolean expressions. We have extended this line of work to checking equivalence
of conditional arithmetic expressions or conditional expressions built from
uninterpreted function terms. Similar ideas have also been used in the random 
interpretation (RI) technique for linear arithmetic [18] and for uninterpreted 
function terms [19] for detecting equivalence of conditional expressions that in- 
volve independent conditionals. Figure 7 compares these related techniques. 

5 Conclusion and Future Work 

We describe in this paper a compact representation of expressions involving
conditionals and linear arithmetic (or uninterpreted functions) such that they can be
compared for equality in an efficient way. In the absence of linear arithmetic and
uninterpreted functions, our technique behaves like ROBDDs. In fact, FCEDs
inherit from ROBDDs the precise handling of dependent conditionals necessary
for discriminating the feasible paths in a program with dependent conditionals.
However, the main strength of FCEDs is the handling of the portions of the
program with independent conditionals. In those situations, the size of FCEDs and
the time to compare two FCEDs is linear (quadratic for uninterpreted functions)
in the size of the program.

The simpler problem involving only independent conditionals can be solved 
in polynomial time by deterministic [22,21] and randomized algorithms [18, 19].
In this special case, randomization brings a lower computational complexity and
the simplicity of an interpreter, without having to manipulate symbolic data 
structures. Once we allow dependent conditionals, the problem becomes NP- 
hard and we should not expect randomization alone to solve it in polynomial 
time. We show in this paper that randomization can still help even for NP-hard 
problems, if we combine it with a symbolic algorithm. We expect that there are 
other NP-hard program analysis problems that can benefit from integrating the 
symbolic techniques with randomization. 

The next step is to implement our algorithms and compare them with the 
existing algorithms with regard to running time and number of equivalences dis- 
covered. The results of our algorithm can also be used as a benchmark to measure 
the number of equivalences that are missed by path-insensitive algorithms. 

We have presented randomized algorithms for checking equivalence of two
FCEDs for the languages $\mathcal{L}_u$ and $\mathcal{L}_a$. It is an open problem to extend these
results to the combined language, i.e. the language that involves both conditional
arithmetic expressions as well as conditional uninterpreted function terms. It
would also be useful to extend these results to other languages/theories apart
from linear arithmetic and uninterpreted functions, for example, the theory of
lists, the theory of uninterpreted functions modulo commutativity, associativity,
or both. Such theories can be used to model program operators more precisely.



Abstract. In this paper we investigate the existence of a deductive
verification method based on a logic that describes pointer aliasing. The
main idea of such a method is that the user has to annotate the program
with loop invariants, pre- and post-conditions. The annotations are then
automatically checked for validity by propagating weakest preconditions
and verifying a number of induced implications. Such a method requires
an underlying logic which is decidable and has a sound and complete
weakest precondition calculus. We start by presenting a powerful logic
(wAL) which can describe the shapes of most recursively defined data
structures (lists, trees, etc.) and has a complete weakest precondition
calculus, but is undecidable. Next, we identify a decidable subset (pAL) for
which we show closure under the weakest precondition operators. In the
latter logic one loses the ability to describe unbounded heap structures,
yet bounded structures can be characterized up to isomorphism. For this
logic two sound and complete proof systems are given, one based on
natural deduction, and another based on the effective method of analytic
tableaux. The two logics presented in this paper can be seen as extreme
values in a framework which attempts to reconcile the naturally opposite
goals of expressiveness and decidability.



1 Introduction 

The problem of pointer aliasing plays an important role in the fields of static 
analysis and software model checking. In general, static analyses used in opti- 
mizing compilers check basic properties such as data sharing and circularities 
in the heap of a program, while model checking deals with the evolution of 
heap structures, in both shape and contents, over time. An early result [21] 
shows that precise may-alias analysis in the presence of loops is undecidable. As
a consequence, the approach adopted by the static analysis community is
abstraction-based shape analysis [23]. This method is effective in the presence of
loops, since the domain of the analysis is bounded, but often imprecise. In this
paper we present an orthogonal solution to the aliasing problem, in that preci- 
sion is the primary goal. To ensure termination, we use Floyd’s method [10] of 
annotating the program with pre-, post-conditions and loop invariants. The an- 
notations are subsequently verified by a push-button procedure, that computes 
weakest preconditions expressed using an effectively decidable logic.

The key is to find a logic that (i) can express aliasing and shape
properties of the program heap, (ii) is effectively decidable, and moreover, (iii)
has a sound and complete weakest precondition calculus with respect to the
atomic statements. While the second and third requirements are clear, the first
one is still ambiguous: what kind of specifications can we express in a decidable
heap logic with weakest preconditions? The contribution of this paper is the
definition of a formal framework in which we prove that such logics can be
found. Our focus is on imperative programs with destructive updating, in which
heaps are viewed as shape graphs with labels only on edges, i.e., we ignore from
the start the internal states of the objects.

As a starting point, we present a general logic, Weak Alias Logic (wAL), that
is expressive enough to describe the recursive data structures of interest (lists,
trees, dags, etc.) as infinite classes of finite graphs. This logic also has a sound and
complete weakest precondition calculus with respect to atomic statements such
as new object creation and assignment of pointers. The satisfiability problem 
of the wAL logic is found to be undecidable but recursively enumerable, which 
motivates further searches for semi-decision procedures and non-trivial decidable 
subsets. 

In the rest of the paper, we define a decidable subset of wAL, called Propo- 
sitional Alias Logic (pAL) for describing pointer aliasing that is, moreover, able 
to characterize arbitrary finite structures and finite classes of structures. The 
tradeoff in defining pAL is losing the ability to describe a number of interesting 
shape properties such as listness, (non)circularity, etc. For this logic, we give 
a proof-theoretic system based on natural deduction, and an effective tableau 
decision method. Both systems are shown to be sound and complete. More- 
over, the satisfiability problem for pAL is shown to be NP-complete. The last 
point concerns the definition, in pAL, of weakest preconditions for imperative
programs with destructive updating. At this point, we use the wAL weakest
precondition calculus, previously developed in [2]. Our weakest precondition
calculus for pAL is sound and complete, as a consequence of the soundness and
completeness of the definitions for wAL weakest preconditions.



Related Work. To describe properties of dynamic program stores, various 
formalisms have been proposed in the literature e.g., Lr [1], BI (Bunched Impli- 
cations) [13], Separation Logic [22] and PAL (Pointer Assertion Language) [17]. 
As a common point with our work, [1] uses regular expressions to describe 
reachability between two points in the heap and is shown to be decidable, yet the 
weakest precondition calculus is not developed. On the other hand, BI [13] and 
Separation Logic [22] produce remarkably simple preconditions and have quite 
clean proof-theoretic models [18]. Another feature of these formalisms is that 
they allow for compositional reasoning [19]. As a downside, the quantifier
fragment, essential to express weakest preconditions, is undecidable [5], while the
ground (propositional) fragment is decidable, a tableau procedure being
proposed in [11]. In a later publication [6], a specialization of the ground fragment
of BI to tree models is used as a type system for a language, based on λ-calculus,
that handles trees. An effectively decidable formalism is PAL [17], an extension
of monadic second-order logic on trees that allows the description of a restricted class
of graphs, known as “graph types” [16], as opposed to our approach that deals
with unrestricted graphs. Programs that manipulate such graphs are restricted
to updating only the underlying tree (backbone). The resulting actions can thus
be described in monadic second-order logic, and the validity of Hoare triples
expressed in PAL can be automatically decided [15].

The decision procedures for both Lr and PAL use Rabin’s result on the 
monadic second order theory of n successors (SnS) [20]. The decision procedure 
for the satisfiability of SnS is however non-elementary. We show that the decision 
problem for the pAL logic is NP-complete, thus drastically improving the com- 
plexity bounds. Also, to the best of our knowledge, no previously published work 
on the verification of heap properties has the ability to deal with unrestricted (de- 
structively updated) data structures, developing a sound and complete weakest 
precondition calculus on top of a decidable logic for graphs. 

2 Weak Alias Logic 

In this section we introduce Weak Alias Logic (wAL), a logic that is expressive 
enough for defining recursive data structures (lists, trees, etc) as infinite classes of 
finite graphs, as well as for defining a weakest precondition calculus of imperative 
programming languages with destructive updating [2]. This section defines the 
logic, and Section 5 briefly recalls the weakest precondition calculus that has 
been developed on top of it. 

Before giving the syntax of wAL, let us introduce the notion of heap, which 
is central in defining interpretations of wAL formulas. Intuitively, a heap is rep- 
resented by a graph where the nodes model objects and the edges model pointers 
between objects. The heap edges are labeled with symbols from a given alphabet
$\Sigma$, which stands for the set of all program pointers, including all program
variables and record fields (selectors). It is furthermore required that the graph 
be deterministic, as a program pointer can only point to one object at a time. 

In this paper we adopt the storeless representation [2], [12], [14], [8] of a graph,
in which each node is associated with the language recognized by the automaton whose
set of states is given by the set of graph nodes, whose transition relation is given by the
set of edges, whose initial state is a designated entry point in the heap, and whose
unique final state is the node itself. The interested reader is referred to [2] for a
detailed discussion on the advantages of the storeless representation of heaps, 
such as compatibility with garbage collection and isomorphic transformations. 

Definition 1 (Heap). A heap $\mathcal{M} \subseteq \mathcal{P}(\Sigma^+)$ is either the empty set or a finite
set $\{X_1, X_2, \ldots, X_n\}$ satisfying the following conditions, for all $1 \le i, j \le n$:

(C1) non-emptiness: $X_i \neq \emptyset$,

(C2) determinism: $i \neq j \Rightarrow X_i \cap X_j = \emptyset$,

(C3) prefix closure and right regularity:

$\forall x \in X_i \, [\forall y, z \in \Sigma^+ \, [x = yz \Rightarrow \exists 1 \le k \le n \, [y \in X_k \wedge X_k z \subseteq X_i]]]$

One can also think of a heap element as the set of all incoming paths leading
to it, paths that start with a program variable. The (C1), (C2) and (C3)
restrictions must be imposed on the elements of a heap in order to maintain the
correspondence (up to isomorphism) with the graph model [2]. An equivalent
approach, taken in [14], [8], is to consider the languages in the heap as
equivalence classes of a right-regular relation on $\Sigma^* \times \Sigma^*$. The set of all heaps over an
alphabet $\Sigma$ is denoted in the following by $\mathcal{H}(\Sigma)$.
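
As a rough illustration of the storeless view, the following sketch computes, for an explicit deterministic graph, the access paths of each node up to a length bound and checks conditions (C1) and (C2) on that truncation. The dictionary encoding of graphs, the bound `max_len` and the function name `access_paths` are assumptions made here; the languages in Definition 1 are in general infinite, so this is only a finite approximation.

```python
def access_paths(edges, roots, max_len):
    """Map each reachable node to its access paths of length <= max_len.

    edges: dict (node, selector) -> node, so the graph is deterministic.
    roots: dict program variable -> node it points to.
    Paths are tuples of symbols and always start with a program variable.
    """
    paths = {}
    frontier = [((var,), node) for var, node in roots.items()]
    for _ in range(max_len):
        next_frontier = []
        for path, node in frontier:
            paths.setdefault(node, set()).add(path)
            for (src, sel), dst in edges.items():
                if src == node:
                    next_frontier.append((path + (sel,), dst))
        frontier = next_frontier
    return paths

# A circular two-element list pointed to by the variable head.
roots = {"head": "n1"}
edges = {("n1", "next"): "n2", ("n2", "next"): "n1"}
heap = access_paths(edges, roots, max_len=4)

sets = list(heap.values())
nonempty = all(sets)                                   # (C1) on the truncation
disjoint = all(not (a & b)                             # (C2) on the truncation
               for i, a in enumerate(sets) for b in sets[i + 1:])
```

Nodes that no program variable can reach never appear in the result, which matches the remark that the storeless representation is compatible with garbage collection.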

Figure 1 introduces the abstract syntax (upper part) and semantics (lower
part) of the wAL logic. The terms of a wAL formula are regular expressions $\rho$
over the alphabet $\Sigma$ with free variables from a set $Var$. We allow the classical
composition operations on regular expressions, together with the left derivate,
denoted by $\rho_1^{-1}\rho_2 = \{\sigma \in \Sigma^* \mid \rho_1 \sigma \cap \rho_2 \neq \emptyset\}$¹. Formulas are built from
the atomic propositions $\rho_1 = \rho_2$ (language equivalence) and $\langle X \rangle \rho_1$ (modality)
connected with the classical first-order operators $\wedge$, $\neg$ and $\exists$. A less usual
requirement is imposed on the syntax of the existential quantifier: the quantified
variable needs to occur at least once within the angled brackets of a modality
in the scope of the quantifier, which is formally captured by the notation $\varphi(X)$. Notice
also that only free variables can occur inside the modality brackets. A formula
$\varphi$ is said to be closed if no variables occur free, i.e., $FV(\varphi) = \emptyset$, where $FV$ is
defined recursively on the syntax, as usual. We define $\forall X . \varphi = \neg \exists X . \neg\varphi$,
$\varphi_1 \vee \varphi_2 = \neg(\neg\varphi_1 \wedge \neg\varphi_2)$, and $\varphi_1 \Rightarrow \varphi_2 = \neg\varphi_1 \vee \varphi_2$. The set of all wAL formulas
over the alphabet $\Sigma$ is formally denoted by $wAL[\Sigma]$.



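$\rho ::= v \in \Sigma \mid X \in Var \mid \ldots \mid \rho_1 \cdot \rho_2 \mid \rho^* \mid \rho_1 \cup \rho_2 \mid \rho_1 \cap \rho_2 \mid \rho_1^{-1}\rho_2$
$\varphi ::= \rho_1 = \rho_2 \mid \langle X \rangle \rho_1 \mid \varphi_1 \wedge \varphi_2 \mid \neg\varphi \mid \exists X . \varphi(X)$

$\mathcal{M} \in \mathcal{H}(\Sigma)$, $\nu : Var \to \mathcal{P}(\Sigma^*)$

$[\![\rho]\!]_{\nu} \triangleq \rho[\nu(FV(\rho))/FV(\rho)]$
$[\![\rho_1 = \rho_2]\!]_{\mathcal{M},\nu} = 1 \iff [\![\rho_1]\!]_{\nu} = [\![\rho_2]\!]_{\nu}$
$[\![\langle X \rangle \rho_1]\!]_{\mathcal{M},\nu} = 1 \iff \nu(X) \in \mathcal{M}$ and $\nu(X) \cap [\![\rho_1]\!]_{\nu} \neq \emptyset$
$[\![\exists X . \varphi]\!]_{\mathcal{M},\nu} = 1 \iff \exists \rho \in \mathcal{P}(\Sigma^*) . [\![\varphi]\!]_{\mathcal{M},\nu[X \leftarrow \rho]} = 1$

Fig. 1. Weak Alias Logic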

A wAL formula is interpreted with respect to a heap $\mathcal{M}$ and a valuation
$\nu$ assigning free variables to languages. The only non-standard operator is the
modality $\langle X \rangle \rho_1$, where $X$ is bound to denote a heap entity which intersects (the
interpretation of) $\rho_1$. As a consequence of the syntactic restriction imposed on
the existential quantifier, all variables in a closed formula are bound to heap
entities². A heap $\mathcal{M}$ is said to be a model for a closed wAL formula $\varphi$ if and
only if $[\![\varphi]\!]_{\mathcal{M},\bot} = 1$. If $\varphi$ has at least one model, it is said to be
satisfiable.

¹ Intuitively, we need the left derivate to describe paths between two objects in the
heap. If $X$ and $Y$ are two objects in a heap, then $X^{-1}Y$ is the language of all paths
between $X$ and $Y$.

At this point, the reader can notice an embedding of wAL into the Monadic
Second Order Logic on graphs. Indeed, a wAL formula is composed of
equivalences of regular expressions ($\rho_1 = \rho_2$) related using first-order connectives. Such
equivalences can be described by finite automata which, in turn, can be specified
in MSOL. However, we found regular expressions more intuitive than MSOL
for the specification of heap properties, as shown in the following.



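Path properties

reach(X, Y)    $\langle Y \rangle X\Sigma^+$
next(X, Y)     $\langle Y \rangle X\Sigma \wedge \forall Y' . \langle Y' \rangle X\Sigma \Rightarrow Y = Y'$
linear(X, Y)   $reach(X, Y) \wedge \forall Z . (X = Z \vee reach(X, Z)) \wedge reach(Z, Y) \Rightarrow \exists Z' . \neg Z' = Z \wedge next(Z, Z')$
cycle(X, Y)    $reach(X, Y) \wedge reach(Y, X)$
share(X, Y)    $\exists Z . reach(X, Z) \wedge reach(Y, Z)$

Recursive data structures

nclist(head)               $\forall X . \langle X \rangle head \Rightarrow \exists Y . \langle Y \rangle X\,next^* \wedge linear(X, Y) \wedge \neg cycle(Y, Y)$
dlist(head, next, prev)    $\forall X, Y \, \exists Z . (\langle X \rangle head \Rightarrow \neg\langle Y \rangle X\,prev) \wedge (\langle Z \rangle X\,next \Rightarrow X \neq Z \wedge \langle X \rangle Z\,prev)$
tree(root)                 $\forall X . \langle X \rangle root \Rightarrow \forall Y, Z . (reach(X, Y) \wedge reach(X, Z)) \Rightarrow \neg share(Y, Z)$
dag(root)                  $\exists X . \langle X \rangle root \wedge \forall Y, Z . reach(X, Y) \wedge reach(X, Z) \Rightarrow \neg cycle(Y, Z)$

Fig. 2. Expressing properties of heaps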



The properties in Figure 2 describe various paths in the structure. We
consider the predicate reach(X, Y) stating that node Y is reachable from node X
by some non-empty path. A node Y is said to be next to a node X if Y is the
only neighbor of X. A path from X to Y is linear if there is no branching, i.e.,
if all the nodes on the path have only one successor. The existence of a cycle
containing both X and Y is given by the cycle(X, Y) predicate.

The wAL logic can also describe the shapes of most typical recursive data
structures used in programming languages with dynamic memory allocation:
lists, trees, dags, etc. For instance, non-cyclic singly-linked lists pointed to by the
head variable and using the next field as forward selector are described by
the nclist predicate. Doubly-linked lists pointed to by the head variable and using
the next and prev field pointers as forward and backward selectors, respectively,
can be captured by the dlist predicate. Some data structures, such as trees,
require the absence of sharing. A sharing predicate expressing that X and Y
belong to two structures that share some node can be given by share(X, Y). A
tree structure pointed to by a variable root is described by the tree formula. A
dag structure in which every node is reachable from a root variable is given by
the dag formula.

² This syntactic restriction on the quantification domain was mainly suggested by
the fact that allowing quantification over $\mathcal{P}(\Sigma^*)$ makes the logic undecidable even
when modalities are not used at all in formulas. A formal proof will be included in
an extended version of this paper.
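
The intended meaning of the path predicates can be sketched on an explicit graph. The code below is only an illustration under an assumption made here: it works with named nodes and an edge dictionary rather than with the languages of the storeless representation, and the helper `successors` is hypothetical.

```python
def successors(edges, x):
    """Nodes reachable from x in exactly one step."""
    return {dst for (src, _sel), dst in edges.items() if src == x}

def reach(edges, x, y):
    """True if y is reachable from x by a non-empty path."""
    seen, stack = set(), list(successors(edges, x))
    while stack:
        n = stack.pop()
        if n == y:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(successors(edges, n))
    return False

def cycle(edges, x, y):
    """True if some cycle contains both x and y."""
    return reach(edges, x, y) and reach(edges, y, x)

def share(edges, x, y):
    """True if some node is reachable from both x and y."""
    nodes = {src for src, _sel in edges} | set(edges.values())
    return any(reach(edges, x, z) and reach(edges, y, z) for z in nodes)

# Node a has two children, b and c, which both point to d.
edges = {
    ("a", "left"): "b",
    ("a", "right"): "c",
    ("b", "next"): "d",
    ("c", "next"): "d",
}
```

Here share(b, c) holds because both reach d, so the structure rooted at a is a dag but not a tree in the sense of Figure 2.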

2.1 Undecidability of wAL 

The result of this section comes as no surprise, in the light of similar
undecidability results for logics able to express graph properties, such as the logic of
Bunched Implications (BI) [5] and Monadic Second-Order Logic of graphs [7].
Given along the same lines as the undecidability proof for BI [5], our proof for
wAL relies on a classical result in finite model theory [9], namely that
first-order logic interpreted over finite structures is undecidable.

Given a vocabulary $V$ of relation symbols, let $FO[V]$ be the set of first-order
formulas with symbols from $V$. For each relation symbol $R \in V$, let $\#(R)$ denote
its arity, i.e., its number of arguments. Let $V = \{R_1, \ldots, R_n\}$ for the rest of this
section. We interpret first-order formulas over structures $\mathfrak{A} = \langle A, R_1^{\mathfrak{A}}, \ldots, R_n^{\mathfrak{A}} \rangle$,
where $A$ is the universe and $R_i^{\mathfrak{A}} \subseteq A^{\#(R_i)}$, $1 \le i \le n$, are the interpretations of
the relation symbols from $V$ over $A$. A structure is said to be finite if and only
if its universe is finite. Given a valuation $\nu : FV(\varphi) \to A$ of the free variables in
a formula $\varphi \in FO[V]$, we denote by $[\![\varphi]\!]^{\mathfrak{A}}_{\nu}$ the interpretation of $\varphi$ in $\mathfrak{A}$. We say
that $\mathfrak{A}$ is a model of a closed first-order formula $\varphi$ if and only if $[\![\varphi]\!]^{\mathfrak{A}}_{\bot} = 1$.
It is known that the problem of finding a finite model for a closed $FO[V]$ formula
is undecidable [9]:

Theorem 1 (Trahtenbrot’s Theorem). Let $V$ be a vocabulary with at least
one symbol of arity two or more. Then the set $Sat[V] = \{\varphi \in FO[V] \mid FV(\varphi) = \emptyset, \varphi \text{ has a finite model}\}$
is not decidable.

Given an arbitrary first-order formula, we shall translate it into a wAL
formula such that satisfiability is strongly preserved by the translation. Considering
that $V = \{R_1, \ldots, R_n\}$, we define $\Sigma_V = \{\alpha_{i,1}, \ldots, \alpha_{i,\#(R_i)}, \beta_i \mid 1 \le i \le n\} \cup \{\gamma\}$.
That is, for each relation symbol of arity $k$ we consider $k$ different $\alpha$-symbols
and a $\beta$-symbol in $\Sigma_V$. The translation is given by the recursive function
$\Theta : FO[V] \to wAL[\Sigma_V]$, defined as:

&{RAXi,...,X#(np))= AX . {X)E* !^Xi)Xaki 

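$\Theta(X = Y) = (X = Y)$
$\Theta(\varphi_1 \wedge \varphi_2) = \Theta(\varphi_1) \wedge \Theta(\varphi_2)$
$\Theta(\neg\varphi) = \neg\Theta(\varphi)$
$\Theta(\exists X . \varphi) = \exists X . \langle X \rangle \Sigma^* \wedge \Theta(\varphi)$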

Note that the translation of a closed first-order formula respects the syntactic 
constraints of wAL, that each quantified variable must occur inside the brackets 
of a modality, and that only a variable can occur on this position. Moreover, a 
closed first-order formula translates into a closed wAL formula. Now it remains 
to be shown that the translation strongly preserves satisfiability. Recall that
satisfiability for wAL is implicitly defined on finite models (Definition 1). Due
to space constraints, all proofs are deferred to [3]. 

Lemma 1. A closed first-order formula $\varphi$ is finitely satisfiable if and only if
$\Theta(\varphi)$ is satisfiable.

Considering for the moment that the alphabet $\Sigma$ is sufficiently large to code
the vocabulary $V$ of a given first-order logic, Theorem 1 and Lemma 1 lead
immediately to the following result.



Theorem 2. For a sufficiently large alphabet $\Sigma$, the set $Sat[\Sigma] = \{\varphi \in wAL[\Sigma] \mid FV(\varphi) = \emptyset, \varphi \text{ has a model}\}$
is not recursive.

Since Theorem 1 holds for vocabularies containing at least one relation
symbol of arity two, by the definition of $\Sigma_V$ it follows that Theorem 2 holds for
generic heaps over alphabets of size at least four. Here, a more refined heap
model could provide us with more intuition in identifying classes of heaps over
which the satisfiability problem becomes decidable. For instance, consider
$\Sigma = \Pi \cup \Omega$, $\Pi \cap \Omega = \emptyset$, $\|\Omega\| = 1$ and all heaps of the form $\mathcal{M} \subseteq \mathcal{P}(\Pi \times \Omega^*)$, i.e.,
heaps consisting only of (possibly circular) singly linked lists. In this simple case,
we propose to revisit the decidability of the satisfiability problem for wAL.

In order to show that the satisfiability problem for wAL is recursively
enumerable, let us first consider the model checking problem. The model checking
problem asks whether a given heap $\mathcal{M}$ is a model of a formula $\varphi$. This problem
is decidable, because any heap model is finite. The interested reader
is referred to [4] for an algorithm. Moreover, the set $\mathcal{H}(\Sigma)$ of all heaps over a
finite alphabet is enumerable. Hence, if a given formula $\varphi$ is satisfiable, an
algorithm that enumerates all models $\mathcal{M}_1, \mathcal{M}_2, \ldots$, testing whether each
$\mathcal{M}_i$ is a model of $\varphi$, will eventually stop.
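
The enumeration argument can be pictured with a minimal Python sketch. The names `semi_decide`, `candidates` and `is_model` are hypothetical and not taken from the paper: `candidates` stands for any enumeration of heaps and `is_model` for a model checker such as the one referenced in [4]. In the usage line, plain integers stand in for heaps.

```python
from itertools import count

def semi_decide(is_model, candidates):
    """Return the first candidate accepted by is_model.

    If no candidate is ever accepted and the enumeration is infinite,
    this loop never terminates: that is what makes it a semi-decision
    procedure rather than a decision procedure.
    """
    for model in candidates:
        if is_model(model):
            return model

# Stand-in usage: integers play the role of the enumerated heaps.
witness = semi_decide(lambda n: n * n > 50, count())
```

The procedure answers "satisfiable" by returning a witness, and gives no answer at all on unsatisfiable inputs.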

Lemma 2. For every finite $\Sigma$, the set $Sat[\Sigma]$ is recursively enumerable.

An interesting open problem is then how to find useful semi-decision proce- 
dures for wAL. 



3 Propositional Alias Logic 

The negative result from the previous section motivates the search for decidable 
subsets of wAL that are able to express meaningful properties of heaps. One 
basic property encountered in many applications is data sharing. In this section 
we define a simpler logic based directly on the notion of aliasing of finite heap 
access paths (Propositional Alias Logic, or pAL for short). The rest of this 
paper is concerned with the study of pAL from three perspectives: proof theory, 
automated reasoning and program logic. The ability of pAL to express heap
properties other than aliasing is also investigated.

Figure 3 defines the abstract syntax (upper part) and the semantics (lower
part) of pAL. The terms are finite words over an alphabet $\Sigma$, with $w_1^{-1} w_2$ being
the suffix of $w_2$ that, concatenated with $w_1$, yields $w_2$, if such a suffix exists, or the
empty word $\epsilon$ otherwise. The atomic propositions are the prefix test ($w_1 \le w_2$)
and the alias proposition ($w_1 \Diamond w_2$). Formulas are built from atomic propositions
connected with the propositional operators $\wedge$ and $\neg$. In the syntax definition, $\bot$

$w ::= v \in \Sigma \mid w_1 \cdot w_2 \mid w_1^{-1} w_2$
$\varphi ::= w_1 \le w_2 \mid w_1 \Diamond w_2 \mid \varphi_1 \wedge \varphi_2 \mid \neg\varphi \mid \bot$

$[\![ w_1 \Diamond w_2 ]\!]_{\mathcal{M}} = 1 \iff \exists X \in \mathcal{M} . w_1, w_2 \in X$

Fig. 3. Propositional Alias Logic



denotes the false literal. The set of all pAL formulas over the alphabet $\Sigma$ is
formally denoted by $pAL[\Sigma]$.

The semantics of pAL is defined with respect to a heap $\mathcal{M}$. An alias
proposition $w_1 \Diamond w_2$ is true if and only if there exists an element of $\mathcal{M}$ such that both
terms $w_1, w_2$ belong to it. Note that, since $\mathcal{M} \subseteq \mathcal{P}(\Sigma^+)$, if either one of the
terms is $\epsilon$, the alias proposition is false. The intended meaning of $w \Diamond w$, for some
$w \in \Sigma^+$, is that $w$ is a well-defined path in the heap. The following semantic
equivalence is a trivial check: $w_1 \Diamond w_2 \iff \exists X . \langle X\rangle w_1 \wedge \langle X\rangle w_2$. The prefix
relation w\ < W 2 can be encoded in wAL as wf^W 2 0 *, where e = 0 * is a pos- 
sible definition of the empty word in wAL. These considerations justify the fact 
that pAL is a subset of wAL. The embedding is proper ($pAL[\Sigma] \subset wAL[\Sigma]$),
since e.g. reachability and linearity are not expressible in pAL.
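
To make the semantics concrete, here is a minimal Python sketch of a pAL evaluator. It rests on several assumptions that are not part of the paper: the heap is given as a finite deterministic graph rather than as a set of path languages, node 0 is a virtual root that is never the target of an edge, symbols are single characters, and formulas are nested tuples. The class `Heap` and the function `holds` are hypothetical names.

```python
class Heap:
    """Finite heap as a deterministic graph.

    edges maps (node, symbol) to a node. Node 0 is a virtual root
    (assumed never to be the target of an edge); all other nodes are objects.
    """
    def __init__(self, edges):
        self.edges = edges

    def reach(self, word):
        """Follow word from the root; return the node reached, or None."""
        node = 0
        for symbol in word:
            node = self.edges.get((node, symbol))
            if node is None:
                return None
        return node


def holds(formula, heap):
    """Evaluate a pAL formula, given as a nested tuple, on a heap."""
    tag = formula[0]
    if tag == "alias":      # w1 and w2 are non-empty and reach the same object
        _, w1, w2 = formula
        n1, n2 = heap.reach(w1), heap.reach(w2)
        return w1 != "" and w2 != "" and n1 is not None and n1 == n2
    if tag == "prefix":     # w1 is a prefix of w2, independent of the heap
        return formula[2].startswith(formula[1])
    if tag == "and":
        return holds(formula[1], heap) and holds(formula[2], heap)
    if tag == "not":
        return not holds(formula[1], heap)
    if tag == "false":
        return False
    raise ValueError(f"unknown connective: {tag}")


# One object pointed to by a, with a b self-loop.
heap = Heap({(0, "a"): 1, (1, "b"): 1})
phi = ("and", ("alias", "a", "ab"),
       ("and", ("not", ("alias", "c", "c")), ("not", ("alias", "ac", "ac"))))
```

Here `holds(phi, heap)` evaluates to true, since `a` and `ab` reach the same object while `c` and `ac` are undefined paths.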



3.1 Natural Deduction System 

This section introduces a natural deduction system [25] for pAL that proves to
be a useful tool in reasoning about aliases. Although later in this paper we adopt
the automated reasoning view, as opposed to the proof-theoretic one, a number of
results from this section are used in the rest of the paper. The system (Figure
4) is that of propositional calculus à la Gentzen (rules $\wedge$E, $\wedge$I, $\neg$E, $\neg$I, $\bot$E, $\bot$I)
to which we add three rules concerning only alias propositions (sufE, sufI and
sym). For these rules we take $\Gamma \subseteq pAL[\Sigma]$, $x, y, z \in \Sigma^+$ and $t \in \Sigma^*$.



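[Figure 4: inference rules of the natural deduction system. The propositional
rules are ($\wedge$E), ($\wedge$I), ($\neg$E), ($\neg$I), ($\bot$E) and ($\bot$I); the alias rules are
(sufE), (sufI) and (sym). Rule (sufI) derives $xt \Diamond z$ from $x \Diamond y$ and $yt \Diamond z$.]

Fig. 4. Natural Deduction System for pAL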



The natural deduction system presented in Figure 4 exhibits a number of
interesting properties: it is sound and complete, and all proofs of alias
propositions can be given in a normal form. To formalize these notions, we need further

® False could have been defined as $\varphi \wedge \neg\varphi$ for an arbitrary formula $\varphi$. However, an
explicit definition is preferred for the purposes of the proof-theoretic system of
Section 3.1.
notation. If $\varphi$ is an alias proposition, we say that $\Gamma \vdash_{\Diamond} \varphi$ if and only if there
exists a derivation of $\varphi$ with premises in $\Gamma$ that uses only the (sufI), (sufE) and
(sym) rules. Otherwise, if $\varphi$ is any formula, we say that $\Gamma \vdash \varphi$ if and only if there
exists a derivation of $\varphi$ with premises in $\Gamma$. By $Th(\Gamma)$ we denote the theory of $\Gamma$,
i.e., the set of all formulas that can be deduced from it: $Th(\Gamma) = \{\varphi \mid \Gamma \vdash \varphi\}$.

Given a finite set of alias propositions, there exists a heap that is a model 
for the entire set. 

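Lemma 3. Let $\Gamma$ be a set of formulas containing a finite number of alias
propositions, ${\sim_\Gamma} \subseteq \Sigma^+ \times \Sigma^+$ be a relation on finite sequences, defined as $x \sim_\Gamma y$
if and only if $\Gamma \vdash_{\Diamond} x \Diamond y$, and $\Pi_\Gamma$ be the set $\{x \mid x \sim_\Gamma x\}$. Then $\sim_\Gamma$ is a
total equivalence relation on $\Pi_\Gamma$, and the quotient $\Pi_\Gamma / {\sim_\Gamma}$ is a heap. Moreover,
$\|\Pi_\Gamma / {\sim_\Gamma}\| \le k \cdot \|\Gamma\|$, where $k \in \mathbb{N}$ is a constant.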

Note that, for arbitrary sets of formulas, the existence of a model occurs as 
a consequence of the downward closure property^. 



3.2 Expressiveness of pAL 

In this section we investigate the expressiveness of the pAL language. We show
that any finite heap structure over a finite alphabet can be uniquely characterized
by a pAL formula. As a consequence, any finite class of heap structures can be
defined in pAL. This extends our previous result in [2], that pAL has the power
to distinguish between any two non-isomorphic heap configurations. However,
the far more interesting question, of whether and how pAL could be extended
to describe recursive data structures and still preserve decidability, is subject to
ongoing and future work.

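For the rest of this section, let $\mathcal{M} = \{X_1, \ldots, X_n\}$ be a given heap. We
shall define a formula $\varphi_{\mathcal{M}}$ such that $[\![\varphi_{\mathcal{M}}]\!]_{\mathcal{M}} = 1$ and, for any other heap $\mathcal{M}'$
such that $[\![\varphi_{\mathcal{M}}]\!]_{\mathcal{M}'} = 1$, we have $\mathcal{M} = \mathcal{M}'$. For a finite word $w$, we
denote by $Pref(w)$ the set of all its prefixes, including $w$. For a set $X \in \mathcal{M}$,
a word $w \in X$ is elementary if and only if it has at most two prefixes in $X$
and at most one prefix in any other set $Y \in \mathcal{M}$, $Y \ne X$. Formally, we have
$Elem_{\mathcal{M}}(X) = \{w \in X \mid \|Pref(w) \cap X\| \le 2 \text{ and } \forall Y \ne X . \|Pref(w) \cap Y\| \le 1\}$.
An important property of the sets of elementary words is finiteness. This results
as a consequence of the fact that both $\mathcal{M}$ and $\Sigma$ are finite, since the length of any
$w \in Elem_{\mathcal{M}}(X)$ is $|w| \le \|\mathcal{M}\| + 1$, thus $\|Elem_{\mathcal{M}}(X)\|$ is bounded by a function
of $\|\Sigma\|$ and $\|\mathcal{M}\|$. A dangling
word is a minimal undefined path in $\mathcal{M}$. Formally, we define $Dang_{\mathcal{M}}(X) =
\{wa \mid w \in Elem_{\mathcal{M}}(X), a \in \Sigma, wa \notin \bigcup\mathcal{M}\}$. Since $Elem_{\mathcal{M}}(X)$ and $\Sigma$ are finite, so
is $Dang_{\mathcal{M}}(X)$. With this notation, we define: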

^ Definition 2 in Section 4. 

® Even if a pAL formula, e.g. $x \Diamond y$, is in general satisfied by an infinite number of
heaps.

® There we proved only that two structures are isomorphic if and only if they are
models of the same pAL formulas.
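
$\Gamma_{\mathcal{M}} = \bigcup_{X \in \mathcal{M}} \{w \Diamond w' \mid w, w' \in Elem_{\mathcal{M}}(X)\} \ \cup$   (1)

$\bigcup_{X, Y \in \mathcal{M}, X \ne Y} \{\neg(w \Diamond w') \mid w \in Elem_{\mathcal{M}}(X), w' \in Elem_{\mathcal{M}}(Y)\} \ \cup$   (2)

$\bigcup_{X \in \mathcal{M}} \{\neg(w \Diamond w) \mid w \in Dang_{\mathcal{M}}(X)\} \cup \{\neg(a \Diamond a) \mid a \in \Sigma \setminus \bigcup\mathcal{M}\}$   (3)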

This set is constructed as follows: the first component (1) describes each object
as a set of alias propositions composed of elementary sequences, the second
component (2) distinguishes between objects using negated alias propositions,
and the third and fourth components (3) describe the dangling sequences. Notice
that $\Gamma_{\mathcal{M}}$ is not minimal, since for instance in (2) it is sufficient to choose only one
$w \in Elem_{\mathcal{M}}(X)$ and one $w' \in Elem_{\mathcal{M}}(Y)$. However, it is finite, according to our
previous considerations. Intuitively, $\Gamma_{\mathcal{M}}$ contains all the necessary information
to characterize $\mathcal{M}$, thus we shall take $\varphi_{\mathcal{M}} = \bigwedge \Gamma_{\mathcal{M}}$. Showing that $\mathcal{M}$ is a model
of $\varphi_{\mathcal{M}}$ is a trivial but tedious check. That it is indeed the only model will be
shown in the rest of this section.

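Lemma 4. Let $\mathcal{M}$ be a heap with $X \in \mathcal{M}$, and $\Gamma_{\mathcal{M}}$ be the characteristic set
defined previously. Then the following hold:

1. for each $w \in X$ there exists $w_0 \in Elem_{\mathcal{M}}(X)$ such that $\Gamma_{\mathcal{M}} \vdash w \Diamond w_0$.

2. for all $w \notin \bigcup\mathcal{M}$ we have $\Gamma_{\mathcal{M}} \vdash \neg(w \Diamond w)$.

3. for any $x, y \in \Sigma^+$ we have $[\![x \Diamond y]\!]_{\mathcal{M}} = 1 \Rightarrow \Gamma_{\mathcal{M}} \vdash x \Diamond y$ and
$[\![x \Diamond y]\!]_{\mathcal{M}} = 0 \Rightarrow \Gamma_{\mathcal{M}} \vdash \neg(x \Diamond y)$.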

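Notice that, from the third point of Lemma 4, and since $\Gamma_{\mathcal{M}}$ is satisfiable,
hence consistent, we obtain that $[\![\varphi]\!]_{\mathcal{M}} = 1$ if and only if $\Gamma_{\mathcal{M}} \vdash \varphi$. Thus, the set
of formulas that are satisfied by $\mathcal{M}$ is finitely axiomatisable, since $Th(\Gamma_{\mathcal{M}}) =
\{\varphi \mid [\![\varphi]\!]_{\mathcal{M}} = 1\}$, and $\Gamma_{\mathcal{M}}$ is finite by definition.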

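Theorem 3. Let $\mathcal{M}$ be a heap and $\varphi_{\mathcal{M}}$ be the formula $\bigwedge \Gamma_{\mathcal{M}}$. If $[\![\varphi_{\mathcal{M}}]\!]_{\mathcal{M}'} = 1$
then $\mathcal{M} = \mathcal{M}'$.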

Example. Given $\Sigma = \{a, b, c\}$, the heap $\mathcal{M} = \{ab^*\}$ composed of one element
pointed to by $a$ with a $b$ self-loop is characterized by the formula
$a \Diamond ab \wedge \neg(c \Diamond c) \wedge \neg(ac \Diamond ac)$.

4 Tableau Decision Procedure for pAL 

A proof that uses natural deduction is mainly based on manually adding as- 
sumptions in order to reach contradictions (and deleting them afterwards). This 
makes, in general, natural deduction unsuitable for automated reasoning and 
motivates our preference for the method of analytic tableaux [24] , an elegant and 
efficient proof procedure for propositional logic, which we subsequently extend 
to pAL. Traditionally, a tableau for a propositional formula $\varphi$ is a tree having $\varphi$
as the root node and subformulas of $\varphi$ or negations of subformulas of $\varphi$ as nodes.
A tableau branch is said to be closed if it contains a formula together with its
negation, and open otherwise. A tableau is said to be closed if and only if all
its branches are closed. To check whether a formula $\varphi$ is a tautology one builds
the tableau for $\neg\varphi$, and infers that $\varphi$ is a tautology if and only if the tableau
eventually closes. In case at least one branch remains open, a counterexample
for $\varphi$ can be extracted.



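($T_1$)   $\neg(xt \Diamond z) \ \ldots \ yt \Diamond z$   /   $\neg(x \Diamond y)$
($T_2$)   $xt \Diamond y$   /   $x \Diamond x$
($T_3$)   $x \Diamond y$   /   $y \Diamond x$
($T_4$)   $\neg(x \Diamond y)$   /   $\neg(y \Diamond x)$
($T_5$)   $\varphi_1 \wedge \varphi_2$   /   $\varphi_1, \varphi_2$
($T_6$)   $\neg(\varphi_1 \wedge \varphi_2)$   /   $\neg\varphi_1 \mid \neg\varphi_2$
($T_7$)   $\neg\neg\varphi$   /   $\varphi$
($T_8$)   $\varphi \ \ldots \ \neg\varphi$   /   $\bot$

Fig. 5. Tableau Expansion Rules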



Figure 5 shows the tableau expansion rules for pAL. We consider that
$x, y, z \in \Sigma^+$ and $t \in \Sigma^*$, that is, we can apply the rules also for an empty
suffix ($t = \epsilon$). The tableau is constructed top-down. A rule whose hypotheses
are of the form $\varphi \ldots \psi$ (namely $T_1$ and $T_8$) can be applied at a node as soon as
both $\varphi$ and $\psi$ are on the path from the root to the current node, in either
order. Rule ($T_5$) expands by putting both $\varphi_1$ and $\varphi_2$ on the same branch of the
tableau, while rule ($T_6$) creates two new branches, one containing $\neg\varphi_1$ and the
other one containing $\neg\varphi_2$. All other rules expand by appending their conclusion to
the current branch. We use rule ($T_8$) to close a branch, since $\bot$ does not expand
any further. Each rule can only be applied provided that its conclusion does not
already appear on the current branch, otherwise the procedure runs the risk of
looping forever (for instance, by applying one of rules $T_2$, $T_3$, $T_4$) without
introducing any new formulas.

Example. Figure 6 presents a sample run of the tableau procedure whose goal
is to prove that, for some given $k \in \mathbb{N}$, $\varphi_k = a \Diamond ab \Rightarrow a \Diamond ab^k$ is a tautology.
First, we eliminate the implication: $\varphi_k = \neg(a \Diamond ab \wedge \neg(a \Diamond ab^k))$ and start the
tableau procedure with $\neg\varphi_k$ as the root node. To the right of each node occurs
the number of the node(s) used in the hypothesis, followed by the name of the
rule applied in order to obtain that node. In this example, the tableau closes
after $k + 6$ steps. There is no branching in this tableau because the rule ($T_6$) is
never applied. □
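
The procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation. It assumes this reading of the rules: ($T_1$) from $\neg(xt \Diamond z)$ and $yt \Diamond z$ add $\neg(x \Diamond y)$; ($T_2$) from $xt \Diamond y$ add $x \Diamond x$; ($T_3$) and ($T_4$) swap the two sides of a positive or negated alias proposition; ($T_5$) splits a conjunction; ($T_6$) branches on a negated conjunction; ($T_7$) removes a double negation; ($T_8$) closes a branch containing a formula and its negation. Words are assumed to be non-empty strings of single-character symbols, and the prefix test and the literal $\bot$ are left out. All function names are hypothetical.

```python
def alias(u, v):
    return ("al", u, v)

def neg(f):
    return ("not", f)

def conj(f, g):
    return ("and", f, g)

def saturate(branch):
    """Apply every non-branching rule once to the formulas on a branch."""
    new = set()
    for f in branch:
        if f[0] == "and":                                    # (T5)
            new.update((f[1], f[2]))
        elif f[0] == "al":
            _, u, v = f
            new.add(alias(v, u))                             # (T3)
            for i in range(1, len(u) + 1):                   # (T2)
                new.add(alias(u[:i], u[:i]))
        elif f[0] == "not" and f[1][0] == "not":             # (T7)
            new.add(f[1][1])
        elif f[0] == "not" and f[1][0] == "al":
            _, u, z = f[1]
            new.add(neg(alias(z, u)))                        # (T4)
            for h in branch:                                 # (T1)
                if h[0] == "al" and h[2] == z:
                    v = h[1]
                    # strip every common suffix t, keeping x and y non-empty
                    for n in range(min(len(u), len(v))):
                        if u[len(u) - n:] == v[len(v) - n:]:
                            new.add(neg(alias(u[:len(u) - n], v[:len(v) - n])))
    return new

def closes(branch):
    """True if every completion of the given branch closes."""
    branch = set(branch)
    while True:
        if any(neg(f) in branch for f in branch):            # (T8)
            return True
        new = saturate(branch) - branch
        if not new:
            break
        branch |= new
    for f in branch:                                         # (T6)
        if f[0] == "not" and f[1][0] == "and":
            left, right = neg(f[1][1]), neg(f[1][2])
            if left not in branch and right not in branch:
                return closes(branch | {left}) and closes(branch | {right})
    return False

def is_tautology(phi):
    return closes({neg(phi)})

# phi_3 from the example: a<>ab implies a<>abbb, with the implication eliminated.
phi_3 = neg(conj(alias("a", "ab"), neg(alias("a", "abbb"))))
```

The sketch terminates because no rule produces a word longer than those already on the branch. It saturates the whole branch in rounds instead of applying one rule per step, so its step count differs from the $k + 6$ of the example.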

The tableau expansion rules can be easily understood with the natural
deduction rules in mind. For instance, rule ($T_1$) can be derived using (sufI), ($\bot$I)
and ($\neg$I). Rules ($T_2$) and ($T_3$) are (sufE) and (sym), respectively, while ($T_4$)
is easily derived using (sym) and ($\neg$I). The rest of the rules correspond to the
purely propositional part of the natural deduction system and are an easy check.
This (and the fact that the natural deduction system is sound and complete) ensures that

^ The definition of a finer notion of redundancy is planned in the full version. 

[1]     $\neg\neg(a \Diamond ab \wedge \neg(a \Diamond ab^k))$
[2]     $a \Diamond ab \wedge \neg(a \Diamond ab^k)$     (1, $T_7$)
[3]     $a \Diamond ab$     (2, $T_5$)
[4]     $\neg(a \Diamond ab^k)$     (2, $T_5$)
[5]     $ab \Diamond a$     (3, $T_3$)
[6]     $\neg(ab^k \Diamond a)$     (4, $T_4$)
[7]     $\neg(ab^{k-1} \Diamond a)$     (5, 6, $T_1$)
...
[k+5]   $\neg(ab \Diamond a)$     (5, k+4, $T_1$)
[k+6]   $\bot$     (5, k+5, $T_8$)

Fig. 6. Tableau Example

the tableau rules are sound, i.e., if a tableau started with $\neg\varphi$ closes, then $\varphi$ is a
tautology. The dual implication, that if $\varphi$ is a tautology then every tableau started
with $\neg\varphi$ will eventually close, will be dealt with in the following.

Note that the rules in Figure 5 do not cover the entire pAL syntax from 
Figure 3: the atomic propositions of the form x < y are not considered. The 
reason is that such propositions trivially evaluate to either true or false and 
could be eliminated from a formula a priori. For completeness, rules for the 
prefix test are given in [3] . 

The rest of this section is concerned with proving that the tableau method
is both complete and effective. To handle the tableau rules in a uniform way,
we use the unified notation of [24]: let an $\alpha$-rule be one of the rules ($T_{1 \ldots 5}$) and a
$\beta$-rule be the rule ($T_6$). We denote the premises of an $R$-rule by $R_1, \ldots, R_n$ and its
conclusions by $R^1, \ldots, R^m$, where $R = \alpha, \beta$.

Definition 2. A set of formulas $\Gamma$ is said to be downward closed if and only if
it respects the following conditions:

- for no $x, y \in \Sigma^+$, we have $x \Diamond y, \neg(x \Diamond y) \in \Gamma$,

- for any $\alpha$-rule, if $\alpha_1, \ldots, \alpha_n \in \Gamma$, then $\alpha^1, \ldots, \alpha^m \in \Gamma$,

- for any $\beta$-rule, if $\beta_1, \ldots, \beta_n \in \Gamma$, then either $\beta^1 \in \Gamma$ or ... or $\beta^m \in \Gamma$.

A tableau branch is said to be complete if no more rules can be applied to 
expand it. A tableau is said to be complete if and only if each of its branches 
is complete. It is manifest that an open complete tableau branch is a down- 
ward closed set. The following technical lemma is key to showing satisfiability 
of downward closed sets. We recall here the definition of the relation $\sim_\Gamma$ from
Lemma 3. The following theorem is the main result of this section.

Lemma 5. For any downward closed set of formulas $\Gamma$, $\neg(x \Diamond y) \in \Gamma$ implies
$x \not\sim_\Gamma y$.

Theorem 4. Any downward closed set of formulas containing a finite number 
of alias propositions is satisfiable. 

The proof of the above theorem uses the model construction technique from
Lemma 3. The same method can moreover be used to derive a counterexample
for a non-valid formula, starting from an open tableau branch. Before stating
our completeness result for the tableau method, let us show that the method is
effective. That is, each tableau procedure started with a finite formula as the
root node, using the rules from Figure 5, eventually terminates.

Lemma 6. The tableau of a finite formula is finite. 

Besides showing termination of the tableau procedure, the above lemma,
together with Theorem 4, ensures that the tableau approach is complete.

Corollary 1. If a formula $\varphi$ is a tautology then every complete tableau starting
with $\neg\varphi$ eventually closes.

In the light of the decidability result concerning pAL, we next investigate
the time complexity of the above satisfiability problem, and find that it is
NP-complete. The proof uses Lemma 3 to show that satisfiability is in NP, and a
reduction from the satisfiability problem for a set of boolean clauses with three
literals (3-SAT) to show NP-hardness.

Theorem 5. The satisfiability problem for pAL is NP-complete. 

5 An Effective Program Logic 

In this section we demonstrate the possibility of using pAL as a weakest
precondition calculus for imperative programs with destructive updating. Otherwise
stated, we show that pAL is closed under applications of the weakest precondition
predicate transformers. Intuitively, this is a consequence of the fact that
pAL formulas refer to finite portions of the heap, and also that straight-line
statements affect bounded regions of the heap. Our proof of closure is constructive,
i.e., we define weakest preconditions as predicate transformers
directly on pAL. This is achieved by means of the sound and complete program
logic defined on top of wAL [2]. Moreover, soundness and completeness of the
pAL weakest precondition axioms are consequences of soundness and completeness
in the case of wAL.

We consider a simple imperative language consisting of the following three
atomic statements. Note that the statements of most object-oriented languages
can be precompiled into this form, possibly by introducing fresh temporary
variables:



Stmnt := uv = null | uv = new | uv = w (where $uv \not\le w$)

Here $v, w \in \Sigma$ denote pointer variables, and $u \in \Sigma^*$ is a (possibly empty)
dereferencing path. The first statement resets the $v$ field of the object pointed
to by $u$, if $u \ne \epsilon$, or the top-level variable $v$ otherwise. This may cause the
built-in garbage collector to reclaim all non-reachable objects. The second statement
allocates a fresh object for further use, and the third statement assigns to its
left-hand side the object pointed to by the right-hand side variable. The syntactic
constraint that comes with the last statement is due to the following technical
problem. The semantics of the assignment is given as the composition of two
primitive operations: first one removes the $v$ arc from the node pointed to by $u$,
and then one assigns it to $w$. If $uv \le w$ and there are no other paths to the cell
pointed to by $w$, the garbage collection caused by the first operation removes
the unreachable cell before the assignment is finished. The requirement $uv \not\le w$
is however sufficient to ensure that, in practice, this situation never occurs.

The axiomatic semantics of this language has been introduced in [2], by defining
a weakest precondition operator pre on wAL formulas, and is briefly recalled
here. For any transition relation over a sequence of statements $\omega \in Stmnt^*$, pre
distributes over conjunction and universal quantification, i.e.,
$pre(\omega, \varphi_1 \wedge \varphi_2) = pre(\omega, \varphi_1) \wedge pre(\omega, \varphi_2)$ and
$pre(\omega, \forall X . \varphi) = \forall X . pre(\omega, \varphi)$. For total transition
relations we have $pre(\omega, \varphi) \Rightarrow \neg pre(\omega, \neg\varphi)$. If, moreover, the transition
relation is total and deterministic, we have that pre is its own dual, i.e.,
$pre(\omega, \varphi) \iff \neg pre(\omega, \neg\varphi)$. In the latter case, pre distributes over disjunction
and existential quantification too. These properties of pre for total, deterministic
programs allow us to define general inference rules for the precondition
inductively on the structure of the postcondition. In particular, it is sufficient to
define preconditions only for modalities, the rest of the atomic propositions in
wAL being pure, i.e., having model-independent denotations. Figure 7 (upper
part) gives the precondition of the primitive storeless operations add, rem and new
for arbitrary modalities. This is generalized to the statements defined
previously (lower part).

{3A.A\Swr* =TA(X)(cr\S'wr*)} rem(S,v) {{T)o} 
{3Xx’'(S,T,A) = t/AVi^i, 2 C(^,^)} add(S,v,T) {{U)a} 

where A) = A U Sv{{T~^ S)v)* {T~^ X) 

Y) = Sv{{T~^ S)v)* X) n X = 0 A {Y)y 
V>r*'(A, Y) = Sv[{T-^ S)v)* X) n X 7 ^ 0 A {Y)E* (3) 

{(T = Sv A a n Sv 0) \/ {T)(j} new(S,v) {(T)cr} 



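$\{\exists S . \langle S\rangle u \wedge pre(rem(S,v), \varphi)\}$   uv = null   $\{\varphi\}$

$\{\exists S . \langle S\rangle u \wedge pre(rem(S,v), pre(new(S,v), \varphi))\}$   uv = new   $\{\varphi\}$
$\{\exists S \exists T . \langle S\rangle u \wedge \langle T\rangle w \wedge pre(rem(S,v), pre(add(S,v,T), \varphi))\}$   uv = w   $\{\varphi\}$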

Fig. 7. wAL Weakest Preconditions 



For the rest of this section, let $\sigma, \tau, \theta, u, v, w$ denote constant words, and
$x, y, z$ denote variables ranging over words. We introduce the following notation:
$\exists x \le \sigma . \varphi(x) = \bigvee_{x \in Pref(\sigma)} \varphi(x)$. Since $\sigma$ is a finite word, so is the formula
on the right. Figure 8 introduces a number of syntactic shorthands, providing
context-dependent translations from wAL to pAL for them. That is, we do
not translate the shorthands individually, but rather in an existentially closed
context.
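
The bounded quantifier is a finite disjunction, which a few lines of Python can illustrate. In this sketch the names `prefixes` and `exists_prefix` are hypothetical, words are Python strings, and whether the empty word counts as a prefix is left as an explicit parameter, since that is an assumption here rather than something fixed by the text above.

```python
def prefixes(sigma, include_empty=False):
    """All prefixes of sigma, including sigma itself."""
    start = 0 if include_empty else 1
    return [sigma[:i] for i in range(start, len(sigma) + 1)]

def exists_prefix(sigma, phi, include_empty=False):
    """Evaluate 'exists x <= sigma . phi(x)' as a disjunction over Pref(sigma)."""
    return any(phi(x) for x in prefixes(sigma, include_empty))
```

Because a word of length n has at most n + 1 prefixes, the expanded formula grows linearly with the length of the bounding word.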
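
Definition: $\alpha_\sigma : \sigma \in Sv\Sigma^*$
  wAL: $\exists S . \langle S\rangle u \wedge \alpha_\sigma$
  pAL: $\exists x \le \sigma . x \Diamond u \wedge xv \le \sigma$
  wAL: $\exists S . \langle S\rangle u \wedge \neg\alpha_\sigma$
  pAL: $u \Diamond u \wedge \neg(\exists x \le \sigma . x \Diamond u \wedge xv \le \sigma)$

Definition: $\delta_\sigma : \sigma \in Sv(T^{-1}X)$
  wAL: $\exists S \exists T \exists X . \langle S\rangle u \wedge \langle T\rangle w \wedge \langle X\rangle\theta \wedge \delta_\sigma$
  pAL: $\exists x \le \sigma . x \Diamond u \wedge xv \le \sigma \wedge w((xv)^{-1}\sigma) \Diamond \theta$
  wAL: $\exists S \exists T \exists X . \langle S\rangle u \wedge \langle T\rangle w \wedge \langle X\rangle\theta \wedge \neg\delta_\sigma$
  pAL: $u \Diamond u \wedge w \Diamond w \wedge \theta \Diamond \theta \wedge \neg(\exists x \le \sigma . x \Diamond u \wedge xv \le \sigma \wedge w((xv)^{-1}\sigma) \Diamond \theta)$

Definition: $\gamma_\sigma : \sigma \in Sv(T^{-1}S)v\Sigma^*$
  wAL: $\exists S \exists T . \langle S\rangle u \wedge \langle T\rangle w \wedge \gamma_\sigma$
  pAL: $\exists x \le \sigma \ \exists y \le (xv)^{-1}\sigma . x \Diamond u \wedge xv \le \sigma \wedge wy \Diamond u \wedge yv \le (xv)^{-1}\sigma$
  wAL: $\exists S \exists T . \langle S\rangle u \wedge \langle T\rangle w \wedge \neg\gamma_\sigma$
  pAL: $u \Diamond u \wedge w \Diamond w \wedge \neg(\exists x \le \sigma \ \exists y \le (xv)^{-1}\sigma . x \Diamond u \wedge xv \le \sigma \wedge wy \Diamond u \wedge yv \le (xv)^{-1}\sigma)$

Fig. 8. wAL to pAL translation shorthands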



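We assert that all translations defined in Figure 8 preserve logical equivalence.
To convince ourselves of this fact, let us perform the step-by-step derivation for
the positive form of $\alpha_\sigma$. The rest of the formulas are translated along the same
lines.

$\exists S . \langle S\rangle u \wedge \alpha_\sigma = \exists S . \langle S\rangle u \wedge \sigma \in Sv\Sigma^* \iff$

$\exists S . \exists x \le \sigma . \langle S\rangle u \wedge \langle S\rangle x \wedge xv \le \sigma \iff \exists x \le \sigma . x \Diamond u \wedge xv \le \sigma$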

The goal of this section is to prove that the logic pAL is expressive enough
to characterize the destructive updating program statements considered
previously. The following theorem captures the result.

Theorem 6. For any sequence of statements $\omega \in Stmnt^*$ and any formula
$\varphi \in pAL[\Sigma]$, we have $pre(\omega, \varphi) \in pAL[\Sigma]$.

The proof proceeds by deriving the weakest precondition for an arbitrary
alias proposition $\sigma \Diamond \tau$ (equivalently written in wAL using the embedding rule),
i.e., applying the rules in Figure 7. The result is then translated back from wAL
to pAL using the shorthands from Figure 8. Then we can extend the result
to arbitrary post-conditions using the distributivity properties for pre, and to
arbitrary sequences of statements by induction on the length of the sequence.

It is important to notice that the translations from pAL to wAL and back 
are logical equivalences. Since the pre operators defined on wAL formulas are 
sound and complete, according to the development in [2], we can infer the exis- 
tence of a sound and complete weakest precondition calculus also for pAL. 

6 Conclusions and Future Work 

This paper concerns a deductive verification method for aliasing properties in 
imperative programming languages with destructive updating. Starting from 
previous work on storeless semantics and alias logic with a weakest precondition
calculus wAL, we show that the satisfiability problem is undecidable but
recursively enumerable. Next, we focus on a decidable subset pAL that allows
one to express sound and complete weakest preconditions. The properties
expressible in this logic are related to pointer aliasing, but arbitrary finite
heaps can also be defined. We give two sound and complete proof systems for pAL,
one based on natural deduction, and another based on analytic tableaux. The
satisfiability problem for pAL is shown to be NP-complete. A tool based on the
pAL framework is planned in the near future.

The main question related to the existence of a decidable program logic that
can express non-trivial shape properties of heaps is not fully answered. Although
undecidable, the wAL logic offers a rich framework in which one can define
decidable fragments having complete weakest precondition calculi. One such
example is pAL. A still open question is the existence of a fragment of wAL that
encompasses pAL, in which one can express properties such as reachability,
circularity, etc. One such extension, called kAL, is currently under investigation.
This logic is obtained from pAL by considering words (over the heap alphabet)
with integer counters (parameters indicating the repetition of a finite subword)
and first-order quantification over the counters. In this way we can express for
instance the existence of an unbounded next-path between two pointers head and
tail: $\exists k . head.\{next\}^k \Diamond tail$, a property that is not expressible in pAL. We plan
an extensive study of this logic, in order to cover both aspects of satisfiability
and expressiveness.

Abstract. Role logic is a notation for describing properties of relational
structures in shape analysis, databases and knowledge bases. A natural 
fragment of role logic corresponds to two-variable logic with counting 
and is therefore decidable. 

In this paper, we show how to use role logic to describe open and closed 
records, as well as the dual of records, inverse records. We observe that 
the spatial conjunction operation of separation logic naturally models 
record concatenation. Moreover, we show how to eliminate the spatial 
conjunction of formulas of quantifier depth one in first-order logic with 
counting. As a result, allowing spatial conjunction of formulas of quantifier
depth one preserves the decidability of two-variable logic with counting.
This result applies to the two-variable role logic fragment as well.
The resulting logic smoothly integrates type system and predicate cal- 
culus notation and can be viewed as a natural generalization of the no- 
tation for constraints arising in role analysis and similar shape analysis 
approaches. 

Keywords: Records, Shape Analysis, Static Analysis, Program Verification,
Two-Variable Logic with Counting, Description Logic, Types



1 Introduction 

In [22] we introduced role logic, a notation for describing properties of relational 
structures that arise in shape analysis, databases and knowledge bases. The 
role logic notation aims to combine the simplicity of role declarations [19] and 
the well-established first-order logic. The use of implicit arguments and syntactic 
sugar of role logic supports easy and concise expression of common idioms for de- 
scribing data structures with mutable references and makes role logic attractive 
as a generalization of type systems in imperative languages, without sacrificing 
the expressiveness of a specification language based on first-order logic. 

The decidability properties of role logic make it appropriate for communi- 
cating information to static analysis tools that go beyond simple type checkers. 
In [22, Section 4] we establish the decidability of the fragment RL$^2$ of role logic by
exhibiting a correspondence with two-variable logic with counting C$^2$, which was

* This research was supported in part by DARPA Contract F33615-00-C-1692, NSF
Grant CCR00-86154, NSF Grant CCR00-63513, and the Singapore-MIT Alliance.
shown decidable in [12]. The fragment RL$^2$ is closed under all boolean operations,
generalizes boolean shape analysis constraints [23] of shape analysis [34, 38] and
generalizes the non-transitive constraints of role analysis [19].

Generalized Records in Role Logic. In this paper we give a systematic 
account of the field and slot declarations of role analysis [19] by introducing a 
set of role logic shorthands that allows concise description of records. Our basic 
idea is to generalize types to unary predicates on objects. Some of the aspects of 
our notion of records that indicate its generality are: 1) We allow building new
records by taking the conjunction, disjunction, or negation of records. 2) In our
notation, a record indicates a property of an object at a particular program point; 
objects can satisfy different record specifications at different program points. As a 
result, our records can express typestate changes such as object initialization [10, 
35] and more general changes in relationships between objects such as movements 
of objects between data structures [19,34]. 3) We allow inverse records as a
dual of records that specify incoming edges of an object in the graph of objects 
representing program heap. Inverse records allow the specification of aliasing 
properties of objects, generalizing unique pointers. Inverse records enable the 
convenient specification of movements of objects that participate in multiple 
data structures. 4) We allow the specification of both open and closed records.
Closed records specify a complete set of outgoing and incoming edges of an 
object. Open records leave certain edges unspecified, which allows orthogonal 
data structures to be specified independently and then combined using logical 
conjunction. 5) We allow the concatenation of generalized records using a form
of spatial conjunction of separation logic, while remaining within the decidable
fragment of two-variable role logic.

Separation Logic. Separation logic [16, 33] is a promising approach for speci- 
fying properties of programs in the presence of mutable data structures. One of 
the main uses of separation logic in previous approaches is dealing with frame 
conditions [5,16]. In contrast, our paper identifies another use of spatial logic: 
expressing record concatenation. Although our approach is based on essentially
the same logical operation of spatial conjunction, our use of spatial conjunction for
records is more local, because it applies to the descriptions of the neighborhood 
of an object. 

To remain within the decidable fragment of role logic, we give in Section 7 
a construction that eliminates spatial conjunction when it connects formulas of 
quantifier depth one. This construction also illustrates that spatial conjunction 
is useful for reasoning about counting stars [12] of the two-variable logic with
counting C$^2$. To our knowledge, this is the first result that combines two-variable
logic with counting and a form of spatial conjunction.

Using the Resulting Logic. We can use specifications written in our notation
to describe properties of objects and relations between objects in
grams with dynamically allocated data structures. These specifications can act 
as assertions, preconditions, postconditions, loop invariants or data structure 
invariants [19,22,26]. By selecting a finite-height lattice of properties for a given 
program fragment, abstract interpretation [9] can be used to synthesize properties
of objects at intermediate program points [2,3,14,19,34,37,39]. Decidability
and closure properties of our notation are essential for the completeness and
predictability of the resulting static analysis [24].

Outline and Contributions. Section 2 reviews the syntax and the semantics 
of role logic [22]. Section 3 defines spatial conjunction in role logic and identi- 
fies its novel use: describing record concatenation. Sections 4 and 5 show how 
to use spatial conjunction in role logic to describe a generalization of records. 
These generalizations are useful for expressing properties of objects and mem- 
ory cells in imperative programs. Section 6 demonstrates that our notation is a 
generalization of local constraints arising in role analysis [19] by giving a natural 
embedding of role constraints into our notation. Section 7 shows how to eliminate
the spatial conjunction connective $\circledast$ from a spatial conjunction $F_1 \circledast F_2$ of
two formulas $F_1$ and $F_2$ when $F_1$ and $F_2$ have no nested counting quantifiers;
this is the core technical result of this paper. As a result, we obtain a decidable
notation for generalized records that supports record concatenation.

2 A Decidable Two-Variable Role Logic RL$^2$

Figure 1 presents the two-variable role logic RL$^2$ [22]. We proved in [22] that RL$^2$
has the same expressive power as the two-variable logic with counting C$^2$. The
logic C$^2$ is a first-order logic 1) extended with counting quantifiers $\exists^{\ge k} x. F(x)$,
saying that there are at least $k$ elements $x$ satisfying formula $F(x)$ for some
constant $k$, and 2) restricted to allow only two variable names $x, y$ in formulas.
An example formula in two-variable logic with counting is

$\forall x. A(x) \Rightarrow (\forall y. f(x, y) \Rightarrow \exists^{=1} x. g(x, y))$   (1)

The formula (1) means that all nodes that satisfy $A(x)$ point along the field $f$
to nodes that have exactly one incoming $g$ edge. Note that the variables $x$ and $y$
may be reused via quantifier nesting, and that formulas of the form $\exists^{=k} x. F(x)$
and $\exists^{\le k} x. F(x)$ are expressible as boolean combinations of formulas of the form
$\exists^{\ge k} x. F(x)$. The logic C$^2$ was shown decidable in [12] and the complexity for
the C$^2_1$ fragment of C$^2$ (with counting up to one) was established in [30]. We can
view role logic as a variable-free version of C$^2$. Variable-free logical notations are
attractive as generalizations of type systems because traditional type systems
are often variable-free. The formula (1) can be written in role logic as
$[A \Rightarrow [f \Rightarrow card^{=1}{\sim}g]]$ where the construct $[F]$ is a shorthand for
$\neg card^{\ge 1} \neg F$ and corresponds to the universal quantifier. The expression
${\sim}g$ denotes the inverse of relation $g$.

In [22] we show how to perform static analysis with $RL^2$ by observing that
straight-line code with procedure invocations can be encoded in $RL^2$. When
loop invariants and procedure specifications are expressed in $RL^2$, the resulting
verification conditions belong to $RL^2$ and can be discharged using a decision
procedure. The analysis of sequences of non-deterministic actions, such as partially
specified procedure calls, is possible because $RL^2$ has a decision procedure that
is parametric with respect to the vocabulary of sets and relations, which means
that the intermediate program states can be modelled by introducing a fresh
copy of the state vocabulary for each program point. Moreover, given a family
of abstraction predicates [34] expressible in $RL^2$, the techniques of [24,39] can
be used to synthesize loop invariants.

$F ::= A \mid f \mid \mathsf{EQ} \mid F_1 \wedge F_2 \mid \neg F \mid F' \mid {\sim}F \mid \mathsf{card}^{\geq k} F$

$e :: (\{1,2\} \to D) \cup (\mathcal{A} \to D \to \mathsf{bool}) \cup (\mathcal{F} \to D^2 \to \mathsf{bool})$
$d \in D$ : domain of first-order structure (set of all objects)
$A \in \mathcal{A}$ : unary predicates (sets)
$f \in \mathcal{F}$ : binary predicates (relations)

$[\![A]\!]e = e\,A\,(e\,1)$        $[\![f]\!]e = e\,f\,(e\,2, e\,1)$
$[\![\mathsf{EQ}]\!]e = (e\,2) = (e\,1)$
$[\![F_1 \wedge F_2]\!]e = ([\![F_1]\!]e) \wedge ([\![F_2]\!]e)$        $[\![\neg F]\!]e = \neg([\![F]\!]e)$
$[\![F']\!]e = [\![F]\!](e[1 \mapsto (e\,2)])$        $[\![{\sim}F]\!]e = [\![F]\!](e[1 \mapsto (e\,2), 2 \mapsto (e\,1)])$
$[\![\mathsf{card}^{\geq k} F]\!]e = |\{d \in D \mid [\![F]\!](e[1 \mapsto d, 2 \mapsto (e\,1)])\}| \geq k$
$F_1 \vee F_2 = \neg(\neg F_1 \wedge \neg F_2)$        $F_1 \Rightarrow F_2 = \neg F_1 \vee F_2$

Fig. 1. The Syntax and the Semantics of $RL^2$

In this paper, we focus on the use of role logic to describe generalized records.
The results of this paper further demonstrate the expressive power of $RL^2$, and
the appropriateness of $RL^2$ as the foundation of both the constraints supplied
by the developer, and the constraints synthesized by a static analysis.
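
The semantics in Figure 1 is compact enough to be sketched as an evaluator over finite structures. The following Python fragment is only an illustration under stated assumptions: the tuple encoding of formulas, the dictionary layout of the environment, and the names `holds`, `card_eq`, `implies` and `forall` are hypothetical and do not come from [22].

```python
# Formulas are nested tuples (an illustrative encoding, not taken from the paper):
#   ("set", "A"), ("rel", "f"), ("eq",), ("and", F1, F2), ("not", F),
#   ("prime", F), ("inv", F), ("card_ge", k, F)

def holds(formula, env):
    """Evaluate a role logic formula following the equations of Figure 1.

    env maps 1 and 2 to objects, "D" to the domain, "sets" to a dict of
    unary predicates and "rels" to a dict of binary relations (sets of pairs).
    """
    tag = formula[0]
    if tag == "set":
        return env[1] in env["sets"][formula[1]]
    if tag == "rel":
        # [[f]]e = e f (e 2, e 1)
        return (env[2], env[1]) in env["rels"][formula[1]]
    if tag == "eq":
        return env[2] == env[1]
    if tag == "and":
        return holds(formula[1], env) and holds(formula[2], env)
    if tag == "not":
        return not holds(formula[1], env)
    if tag == "prime":
        return holds(formula[1], {**env, 1: env[2]})
    if tag == "inv":
        return holds(formula[1], {**env, 1: env[2], 2: env[1]})
    if tag == "card_ge":
        k, body = formula[1], formula[2]
        count = sum(holds(body, {**env, 1: d, 2: env[1]}) for d in env["D"])
        return count >= k
    raise ValueError(f"unknown formula tag: {tag}")

def card_eq(k, f):
    """card^{=k} F expressed through card^{>=k} and card^{>=k+1}."""
    return ("and", ("card_ge", k, f), ("not", ("card_ge", k + 1, f)))

def implies(f, g):
    """F => G, written with the primitive connectives."""
    return ("not", ("and", f, ("not", g)))

def forall(f):
    """The construct [F], that is, not card^{>=1} not F."""
    return ("not", ("card_ge", 1, ("not", f)))

# Formula (1) in role logic: [A => [f => card^{=1} ~g]]
FORMULA_1 = forall(implies(("set", "A"),
                           forall(implies(("rel", "f"),
                                          card_eq(1, ("inv", ("rel", "g")))))))
```

Evaluating `FORMULA_1` on a three-object structure in which the only `A` object has an `f` edge to a node with a single incoming `g` edge should return true; adding a second incoming `g` edge to that node should make it false.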

3 Spatial Conjunction 

This section introduces our notion of spatial conjunction $\circledast$. To motivate our use
of spatial conjunction, we first illustrate how role logic supports the description 
of simple properties of objects in a concise way. 

Example 1. The formula $[f \Rightarrow A]$ is true for an object whose every $f$-field points
to an $A$ object, the formula $[g \Rightarrow B]$ means that every $g$-field points to a $B$
object, so $[f \Rightarrow A] \wedge [g \Rightarrow B]$ denotes the objects that have both $f$ pointing to
an $A$ object and $g$ pointing to a $B$ object. Such a specification is as concise as the
following Java class declaration class C { A f; B g; }.

Example 1 illustrates how the presence of conjunction $\wedge$ in role logic enables
the combination of orthogonal properties such as constraints on distinct fields.
However, not all properties naturally compose using conjunction.

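Example 2. Consider a program that contains three fields, modelled as binary
relations $f$, $g$, $h$. The formula $P_f = (\mathsf{card}^{=1} f) \wedge (\mathsf{card}^{=0}(g \vee h))$ means that
the object has only one outgoing $f$-edge and no other edges. The formula
$P_g = (\mathsf{card}^{=1} g) \wedge (\mathsf{card}^{=0}(f \vee h))$ means that the object has only one outgoing $g$-edge
and no other edges. If we “physically join” the two records, each of which has
one field, we obtain a record that has two fields, and is described by the formula
$P_{fg} = (\mathsf{card}^{=1} f) \wedge (\mathsf{card}^{=1} g) \wedge (\mathsf{card}^{=0} h)$. Note that it is not the case that
$P_{fg} \approx P_f \wedge P_g$. In fact, no boolean combination of $P_f$ and $P_g$ yields $P_{fg}$.

$[\![F_1 \circledast F_2]\!]e = \exists e_1, e_2.\ \mathsf{split}\ e\ [e_1\ e_2] \wedge [\![F_1]\!]e_1 \wedge [\![F_2]\!]e_2$

$\mathsf{split}\ e\ [e_1\ e_2] =$
    $\forall A \in \mathcal{A}.\ \forall d \in D.\ ((e\,A)\,d \Leftrightarrow (e_1 A)\,d \vee (e_2 A)\,d) \wedge \neg((e_1 A)\,d \wedge (e_2 A)\,d)\ \wedge$
    $\forall f \in \mathcal{F}.\ \forall d_1, d_2 \in D.$
        $(e\,f\,(d_1,d_2) \Leftrightarrow (e_1 f\,(d_1,d_2) \vee e_2 f\,(d_1,d_2))) \wedge \neg(e_1 f\,(d_1,d_2) \wedge e_2 f\,(d_1,d_2))$

$\mathsf{emp} = [[\bigwedge_{A \in \mathcal{A}} \neg A \wedge \bigwedge_{f \in \mathcal{F}} \neg f]]$

priority: $\wedge$ binds strongest, then $\circledast$, then $\vee$; $F \approx G$ means $\forall e.\ [\![F]\!]e = [\![G]\!]e$

$(F_1 \circledast F_2) \circledast F_3 \approx F_1 \circledast (F_2 \circledast F_3)$        $F \circledast \mathsf{emp} \approx \mathsf{emp} \circledast F \approx F$
$F_1 \circledast F_2 \approx F_2 \circledast F_1$        $F_1 \circledast (F_2 \vee F_3) \approx F_1 \circledast F_2 \vee F_1 \circledast F_3$

Fig. 2. Semantics and Properties of Spatial Conjunction $\circledast$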

Example 2 prompts the question: is there an operation for joining specifications
that allows us to combine $P_f$ and $P_g$ into $P_{fg}$? Moreover, can we
define such an operation on records viewed as arbitrary formulas in role logic?

It turns out that there is a natural way to describe the set of models of formula
$P_{fg}$ in Example 2 as the result of “physically merging” the edges (relations)
of the models of $P_f$ and the models of $P_g$. The merging of disjoint models of
formulas is the idea behind the definition of spatial conjunction $\circledast$ in Figure 2.
The predicate ($\mathsf{split}\ e\ [e_1\ e_2]$) is true iff the relations of the model (environment)
$e$ can be split into $e_1$ and $e_2$. The idea of splitting is that each unary relation
$(e\,A)$ is a disjoint union of relations $(e_1 A)$ and $(e_2 A)$, and similarly each binary
relation $(e\,f)$ is a disjoint union of relations $(e_1 f)$ and $(e_2 f)$. For $\mathsf{split}\ e\ [e_1\ e_2]$
we also require that the domain $D$ of objects is the same in all of $e_1$, $e_2$, and $e$.
If we consider models $e$ as graphs, then our notion of spatial conjunction keeps a
fixed set of nodes, and splits the edges of the graph, as illustrated in Figure 2.
The notion of splitting generalizes to splitting into any number of environments.
Having introduced spatial conjunction $\circledast$, we observe that for $P_f$, $P_g$, and $P_{fg}$
of Example 2, we simply have $P_{fg} \approx P_f \circledast P_g$.
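
The splitting of edges can be checked by brute force on small graphs. The sketch below is an assumption-laden illustration rather than the decision procedure of this paper: it models only binary relations (no unary predicates), represents a predicate as a Python function of a relation dictionary and an object, and the names `splits`, `spatial` and `exact_fields` are hypothetical.

```python
from itertools import product

def out_degree(rels, name, obj):
    """Number of outgoing `name` edges of obj."""
    return sum(1 for (src, _dst) in rels[name] if src == obj)

def splits(rels):
    """Yield every pair (e1, e2) whose relations partition those of rels.

    The set of nodes is left implicit and therefore stays fixed; only
    edges are distributed between the two environments.
    """
    edges = [(name, pair) for name in sorted(rels) for pair in sorted(rels[name])]
    for choice in product((1, 2), repeat=len(edges)):
        e1 = {name: set() for name in rels}
        e2 = {name: set() for name in rels}
        for (name, pair), side in zip(edges, choice):
            (e1 if side == 1 else e2)[name].add(pair)
        yield e1, e2

def spatial(p, q):
    """Spatial conjunction of two predicates: some split satisfies both."""
    return lambda rels, obj: any(p(e1, obj) and q(e2, obj)
                                 for e1, e2 in splits(rels))

def exact_fields(fields):
    """Exactly one outgoing edge for each listed field, none for the others."""
    def pred(rels, obj):
        return all(out_degree(rels, n, obj) == (1 if n in fields else 0)
                   for n in rels)
    return pred

P_F = exact_fields({"f"})        # P_f of Example 2
P_G = exact_fields({"g"})        # P_g of Example 2
P_FG = exact_fields({"f", "g"})  # P_fg of Example 2
```

On a graph where object 0 has one `f` edge and one `g` edge, `spatial(P_F, P_G)` and `P_FG` agree, while the ordinary conjunction of `P_F` and `P_G` fails, which mirrors the observation of Example 2. The enumeration is exponential in the number of edges, so it is suitable only for tiny examples.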

See [22, Page 6] for a comparison of our notion of spatial conjunction with [16].

4 Field Complement

As a step towards a record calculus in role logic, this section introduces the 
notion of a field complement, which makes it easier to describe records in role 
logic. 

Example 3. Consider the formula $P_f = (\mathsf{card}^{=1} f) \wedge (\mathsf{card}^{=0}(g \vee h))$ from
Example 2, stating the property that an object has only one outgoing $f$-edge and no
other edges. Property $P_f$ has little to do with $g$ or $h$, yet $g$ and $h$ explicitly occur
in $P_f$. Moreover, we need to know the entire set of relations in the language to
write $P_f$: if the language contains an additional field $i$, the property $P_f$ would
become $P_f = (\mathsf{card}^{=1} f) \wedge (\mathsf{card}^{=0}(g \vee h \vee i))$. Note also that $\neg f$ is not the same
as $g \vee h \vee i$, because $\neg f$ computes the complement of the value of the relation
$f$ with respect to the universal relation $D^2$, whereas $g \vee h \vee i$ is the union of all
relations other than $f$.

To address the notational problem illustrated in Example 3, we introduce the
symbol $\mathsf{edges}$, which denotes the union of all binary relations, formally
$\mathsf{edges} = \bigvee_{g \in \mathcal{F}} g$, and the notation $-f$ (field complement of $f$), which denotes the
union of all relations other than $f$, formally $-f = \bigvee_{g \neq f} g$. This additional notation
allows us to avoid explicitly listing all fields in the language when stating properties like $P_f$.
Formula $P_f$ from Example 3 can be written as $P_f = (\mathsf{card}^{=1} f) \wedge (\mathsf{card}^{=0} {-f})$,
which mentions only $f$. Even when the language is extended with additional
relations, $P_f$ still denotes the intended property. Similarly, to denote the property
of an object that has outgoing fields given by $P_f$ and has no incoming fields, we
use the predicate $P_f \wedge \mathsf{card}^{=0} {\sim}\mathsf{edges}$.
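
The difference between the field complement and logical negation can be made concrete on a finite graph. This is a minimal sketch, assuming relations are stored as a dictionary from field names to sets of pairs; the function names are hypothetical and chosen only for this illustration.

```python
def edges(rels):
    """Union of all binary relations (the symbol `edges`)."""
    return set().union(*rels.values())

def field_complement(rels, field):
    """Union of all relations other than `field` (the field complement -f)."""
    return set().union(*(pairs for name, pairs in rels.items() if name != field))

def logical_complement(rels, field, domain):
    """Complement of `field` with respect to the universal relation D x D."""
    return {(a, b) for a in domain for b in domain} - rels[field]

def p_f(rels, obj, field="f"):
    """(card^{=1} f) and (card^{=0} -f), counted over outgoing edges of obj."""
    own = sum(1 for (src, _dst) in rels[field] if src == obj)
    others = sum(1 for (src, _dst) in field_complement(rels, field) if src == obj)
    return own == 1 and others == 0
```

Because `p_f` never names the other fields, the same definition keeps its intended meaning when a new relation such as `i` is added to the dictionary.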

5 Records and Inverse Records 

In this section we use role logic with spatial conjunction and field complement 
from Section 4 to introduce a notation for records and inverse records. 

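multifield: $f \xrightarrow{*} A = \mathsf{card}^{=0}({-f} \vee (f \wedge \neg A))$

field: $f \xrightarrow{s} A = \mathsf{card}^{s}(A \wedge f) \wedge f \xrightarrow{*} A$

    $s$ of the form $=k$, $\leq k$, or $\geq k$, for $k \in \{0, 1, 2, \ldots\}$

multislot: $A \xleftarrow{*} f = \mathsf{card}^{=0}({\sim}{-f} \vee ({\sim}f \wedge \neg A))$

slot: $A \xleftarrow{s} f = \mathsf{card}^{s}(A \wedge {\sim}f) \wedge A \xleftarrow{*} f$

    $s$ of the form $=k$, $\leq k$, or $\geq k$, for $k \in \{0, 1, 2, \ldots\}$

Fig. 3. Record Notation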



The notation for records and inverse records is presented in Figure 3. A multifield
predicate $f \xrightarrow{*} A$ is true iff the object has any number of outgoing $f$-edges
terminating at $A$, and no other edges. Dually, a multislot predicate $A \xleftarrow{*} f$ is true iff
the object has any number of incoming $f$-edges originating from $A$, and no other
edges. We also allow notation $f \xrightarrow{s} A$ where $s$ is an expression of the form $=k$,
$\leq k$, or $\geq k$. This notation gives a bound on the number of outgoing edges, and
implies that there are no other outgoing edges. We similarly introduce $A \xleftarrow{s} f$. A
closed record is a spatial conjunction of fields and multifields. An open record is
a spatial conjunction of a closed record with True. While a closed record allows
only the listed fields, an open record allows any number of additional fields.
Inverse records are dual to records, and we similarly distinguish open and closed
inverse records. We abbreviate $f \xrightarrow{=1} A$ by $f \to A$ and $A \xleftarrow{=1} f$ by $A \leftarrow f$.
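
The multifield and field predicates of Figure 3 can be read as simple checks on the outgoing edges of one object. The following sketch assumes a dictionary from field names to sets of pairs and encodes the bound $s$ as an operator string plus an integer; these encodings and the names `multifield` and `field` are assumptions made for illustration.

```python
import operator

# Bounds of the form =k, <=k, >=k
BOUNDS = {"=": operator.eq, "<=": operator.le, ">=": operator.ge}

def targets(rels, name, obj):
    """Targets of the outgoing `name` edges of obj."""
    return [dst for (src, dst) in rels.get(name, ()) if src == obj]

def multifield(f, members):
    """card^{=0}(-f or (f and not A)): only f-edges leave obj, all ending in A."""
    def pred(rels, obj):
        no_other = all(not targets(rels, n, obj) for n in rels if n != f)
        return no_other and all(t in members for t in targets(rels, f, obj))
    return pred

def field(f, members, op, k):
    """card^{s}(A and f) together with the multifield condition."""
    def pred(rels, obj):
        hits = sum(1 for t in targets(rels, f, obj) if t in members)
        return BOUNDS[op](hits, k) and multifield(f, members)(rels, obj)
    return pred
```

For instance, `field("f", {1, 2}, ">=", 1)` accepts an object with two `f` edges into the set `{1, 2}` and no other outgoing edges, whereas `field("f", {1, 2}, "=", 1)` rejects it.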

Example 4. To describe a closed record whose only fields are $f$ and $g$ where
$f$-fields point to objects in the set $A$ and $g$-fields point to objects in the set
$B$, we use the predicate $P_1 = f \to A \circledast g \to B$. The definition of $P_1$ lists all
fields of the object. To specify an open record which certainly has fields $f$ and $g$
but may or may not have other fields, we write $P_2 = f \to A \circledast g \to B \circledast \mathsf{True}$.
Neither $P_1$ nor $P_2$ restricts incoming references of an object. To specify that
the only incoming references of an object are from the field $h$, we conjoin $P_2$
with the closed inverse record consisting of a single multislot $\mathsf{True} \xleftarrow{*} h$, yielding
the predicate $P_3 = P_2 \wedge \mathsf{True} \xleftarrow{*} h$. To specify that an object has exactly
one incoming reference, and that the incoming reference is from the $h$ field and
originates from an object belonging to the set $C$, we use $P_4 = P_2 \wedge C \leftarrow h$.
Note that specifications $P_3$ and $P_4$ go beyond most standard type systems in
their ability to specify the incoming (in addition to the outgoing) references of
objects.

6 Role Constraints 

Role constraints were introduced in [18,19]. In this section we show that role 
logic is a natural generalization of role constraints by giving a translation from 
role constraints to role logic. A logical view of role constraints is also suggested 
in [20,21]. A role is a set of objects that satisfy a conjunction of the following 
four kinds of constraints: field constraints, slot constraints, identities, acyclicities. 
In this paper we show that role logic naturally models field constraints, slot 
constraints, and identities^. 

Roles Describing Complete Sets of Fields and Slots. Figure 4 shows 
the translation of role constraints [19, Section 3] into role logic formulas. The 
simplicity of the translation is a consequence of the notation for records that we 
have developed in this paper. 

Simultaneous Roles. In object-oriented programs, objects may participate in
multiple data structures. The idea of simultaneous roles [19, Section 7.2] is to
associate one role for the participation of an object in one data structure. When
the object participates in multiple data structures, the object plays multiple
roles. Role logic naturally models simultaneous roles: each role is a unary
predicate, and if an object satisfies multiple roles, then it satisfies the conjunction
of predicates. Figure 5 presents the translation of field and slot constraints of
simultaneous roles into role logic. Whereas the roles of [19, Section 3] translate
to closed records and closed inverse records, the simultaneous roles of [19,
Section 7.2] translate to specifications that are closer to open records and open inverse
records.

^ Acyclicities go beyond first-order logic because they involve non-local transitive
closure properties.

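$\mathcal{C}[\![\mathsf{fields}\ F;\ \mathsf{slots}\ S;\ \mathsf{identities}\ I]\!] = \mathcal{C}[\![\mathsf{fields}\ F]\!] \wedge \mathcal{C}[\![\mathsf{slots}\ S]\!] \wedge \mathcal{C}[\![\mathsf{identities}\ I]\!]$
$\mathcal{C}[\![\mathsf{fields}\ f_1 : S_1, \ldots, f_n : S_n]\!] = f_1 \to S_1 \circledast \ldots \circledast f_n \to S_n$
$\mathcal{C}[\![\mathsf{slots}\ S_1.f_1, \ldots, S_n.f_n]\!] = S_1 \leftarrow f_1 \circledast \ldots \circledast S_n \leftarrow f_n$
$\mathcal{C}[\![\mathsf{identities}\ f_1.g_1, \ldots, f_n.g_n]\!] = \bigwedge_{i=1}^{n} [f_i \Rightarrow {\sim}g_i]$

Fig. 4. Translation of Role Constraints [19] into Role Logic Formulas

$\mathcal{O}[\![\mathsf{fields}\ F;\ \mathsf{slots}\ S;\ \mathsf{identities}\ I]\!] = \mathcal{O}[\![\mathsf{fields}\ F]\!] \wedge \mathcal{O}[\![\mathsf{slots}\ S]\!] \wedge \mathcal{C}[\![\mathsf{identities}\ I]\!]$
$\mathcal{O}[\![\mathsf{fields}\ f_1 : S_1, \ldots, f_n : S_n]\!] = \mathcal{C}[\![\mathsf{fields}\ f_1 : S_1, \ldots, f_n : S_n]\!] \circledast \mathsf{card}^{=0}(\bigvee_{i=1}^{n} f_i)$
$\mathcal{O}[\![g_1, \ldots, g_m\ \mathsf{slots}\ S_1.f_1, \ldots, S_n.f_n]\!] = \mathcal{C}[\![\mathsf{slots}\ S_1.f_1, \ldots, S_n.f_n]\!] \circledast \mathsf{card}^{=0}(\bigvee_{i=1}^{m} {\sim}g_i)$

Fig. 5. Translation of Simultaneous Role Constraints [19, Section 7.2] into Role Logic
Formulas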



7 Eliminating Spatial Conjunction in $RL^2$

Preserving the Decidability. Previous sections have demonstrated the use- 
fulness of adding record concatenation in the form of spatial conjunction to our 
notation for generalized records. However, a key question remains: is the result- 
ing extended notation decidable? In this section we give an affirmative answer 
to this question by showing how to compute the spatial conjunction for a large 
class of record specifications using the remaining logical operations. 

Approach. Consider two formulas $F_1$ and $F_2$ in first-order logic with counting,
where both $F_1$ and $F_2$ have quantifier depth one. An equivalent way of stating
the condition on $F_1$ and $F_2$ is that there are no nested occurrences of quantifiers.
(Note that we count one application of $\exists^{\geq k} x. P$ as one quantifier, regardless of the
value $k$.) We show that, under these conditions, the spatial conjunction $F_1 \circledast F_2$
can be written as an equivalent formula $F_3$ where $F_3$ does not contain the spatial
conjunction operation $\circledast$. The proof proceeds by writing formulas $F_1$, $F_2$ in a
normal form, as a disjunction of counting stars [12], and showing that the spatial
conjunction of counting stars is equivalent to a disjunction of counting stars. It
follows that adding $\circledast$ to (full first-order or two-variable) logic with counting
does not change the expressive power of that logic, provided that the operands
of $\circledast$ have quantifier depth at most one. Here we allow $F_1$ and $F_2$ themselves to
contain spatial conjunction, because we may eliminate spatial conjunction in $F_1$
and $F_2$ recursively. Applying these results to two-variable logic with counting
$C^2$, we conclude that introducing into $C^2$ the spatial conjunction of formulas
of quantifier depth one preserves the decidability of $C^2$. Furthermore, thanks to
the translations between $C^2$ and $RL^2$ in [22], if we allow the spatial conjunction
of $RL^2$ formulas with no nested card occurrences, we preserve the decidability of
the logic $RL^2$. The formulas of the resulting logic are given by

$F ::= A \mid f \mid \mathsf{EQ} \mid F_1 \wedge F_2 \mid \neg F \mid F' \mid {\sim}F \mid \mathsf{card}^{\geq k} F$
    $\mid F_1 \circledast F_2$, if $F_1$ and $F_2$ have no nested card occurrences

Note that record specifications in Figure 3 contain no nested card occurrences,
so joining them using $\circledast$ yields formulas in the decidable fragment. Hence, in
addition to quantifiers and boolean operations, the resulting logic supports a
generalization of record concatenation, and is still decidable; this decidability
property is what we show in the sequel. We present a sketch of the proof;
see [25] for proof details and additional remarks.



7.1 Atomic Type Formulas 

In this section we introduce classes of formulas that correspond to the model- 
theoretic notion of atomic type [29, Page 20]. We then introduce formulas that 
describe the notion of counting stars [12,30]. We conclude this section with 
Proposition 9, which gives the normal form for formulas of quantifier depth one. 

If $\mathcal{C} = C_1, \ldots, C_m$ is a finite set of formulas, then a cube over $\mathcal{C}$ is a
conjunction of the form $C_1^{\alpha_1} \wedge \ldots \wedge C_m^{\alpha_m}$ where $\alpha_i \in \{0, 1\}$, $C^1 = C$ and $C^0 = \neg C$. For
simplicity, fix a finite language $L = \mathcal{A} \cup \mathcal{F}$ with $\mathcal{A}$ a finite set of unary predicate
symbols and $\mathcal{F}$ a finite set of binary predicate symbols. We work in predicate
calculus with equality, and assume that the equality “=”, where $= \notin \mathcal{F}$, is present
as a binary relation symbol, unless explicitly stated otherwise. We use $D$ to
denote a finite domain of interpretation and $e$ to denote a model with variable
assignment; $e$ maps $\mathcal{A}$ to $2^D$, maps $\mathcal{F}$ to $2^{D \times D}$ and maps variables to elements
of $D$. Let $x_1, \ldots, x_n$ be a finite list of distinct variables. Let $\mathcal{C}$ be the set of all
atomic formulas $F$ such that $\mathsf{FV}(F) \subseteq \{x_1, \ldots, x_n\}$. The set $\mathcal{C}$ is finite (in our
case it has $|\mathcal{A}|n + (|\mathcal{F}| + 1)n^2$ elements). We call a cube over $\mathcal{C}$ a complete atomic
type (CAT) formula. From the disjunctive normal form theorem for propositional
logic, we obtain the following Proposition 5.

Proposition 5. Every quantifier-free formula $F$ such that $\mathsf{FV}(F) \subseteq \{x_1, \ldots, x_n\}$
is equivalent to a disjunction of CAT formulas $C$ such that
$\mathsf{FV}(C) = \{x_1, \ldots, x_n\}$.
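
The size of the set of atomic formulas and the cubes over it can be enumerated directly for small languages. The sketch below is illustrative only; the tuple encoding of atoms and the function names are assumptions, and the number of cubes grows as two to the number of atoms.

```python
from itertools import product

def atomic_formulas(unary, binary, variables):
    """List all atomic formulas over `variables`; "=" is added as a binary symbol."""
    atoms = [(a, x) for a in unary for x in variables]
    atoms += [(r, x, y) for r in [*binary, "="] for x in variables for y in variables]
    return atoms

def cubes(atoms):
    """Yield every cube: one polarity (True means positive literal) per atom."""
    for signs in product((True, False), repeat=len(atoms)):
        yield tuple(zip(atoms, signs))

def is_trivially_contradictory(cube):
    """A cube that contains the negated literal x = x cannot be satisfied."""
    return any(len(atom) == 3 and atom[0] == "=" and atom[1] == atom[2] and not sign
               for atom, sign in cube)
```

With two unary symbols, one binary symbol and two variables, `atomic_formulas` returns $2 \cdot 2 + (1 + 1) \cdot 2^2 = 12$ atoms, in agreement with the count $|\mathcal{A}|n + (|\mathcal{F}| + 1)n^2$ stated above.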

A CAT formula may be contradictory if, for example, it contains the literal
$x_i \neq x_i$ as a conjunct. We next define classes of CAT formulas that are satisfiable
in the presence of equality. A general-case CAT (GCCAT) formula describes
the case where all variables denote distinct values: a GCCAT formula is a CAT
formula $F$ such that the following two conditions hold: 1) $\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$;
2) for all $1 \leq i, j \leq n$, the conjunct $x_i = x_j$ is in $F$ iff $i = j$. An equality
CAT (EQCAT) formula is a formula of the form $\bigwedge_{j=1}^{m} y_j = x_{i_j} \wedge F$, where
$1 \leq i_1, \ldots, i_m \leq n$ and $F$ is a GCCAT formula such that $\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$.

Lemma 6. Every CAT formula F is either contradictory, or is equivalent to an 
EQCAT formula F' such that FV(F') = FV(F). 

From Proposition 5 and Lemma 6, we obtain the following Proposition 7. 

Proposition 7. Every quantifier-free formula $F$ such that $\mathsf{FV}(F) \subseteq \{x_1, \ldots, x_n\}$
can be written as a disjunction of EQCAT formulas $C$ such that
$\mathsf{FV}(C) = \{x_1, \ldots, x_n\}$.

We next introduce the notion of an extension of a GCCAT formula. Let
$x, x_1, \ldots, x_n$ be distinct variables and $F$ be a GCCAT formula such that
$\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$. We say that $F'$ is an $x$-extension of $F$, and write
$F' \in \mathsf{exts}(F, x)$ iff all of the following conditions hold: 1) $F \wedge F'$ is a GCCAT
formula; 2) $\mathsf{FV}(F \wedge F') = \{x, x_1, \ldots, x_n\}$; 3) $F$ and $F'$ have no common atomic
formulas. Note that if $\mathsf{FV}(F_1) = \mathsf{FV}(F_2)$, then $\mathsf{exts}(F_1, x) = \mathsf{exts}(F_2, x)$, i.e. the
set of extensions of a GCCAT formula depends only on the free variables of the
formula; we introduce additional notation $\mathsf{exts}(x_1, \ldots, x_n, x)$ to denote $\mathsf{exts}(F, x)$
for $\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$.

To define a normal form for formulas of quantifier depth one, we introduce
the notion of $k$-counting star. If $p \geq 2$ is an integer, let $p^+$ be a new symbol
representing the co-finite set of integers $\{p, p+1, \ldots\}$. Let $C_p = \{0, 1, \ldots, p-1, p^+\}$.
If $c \in C_p$, by $\exists^{c} x. F$ we mean $\exists^{=c} x. F$ if $c$ is an integer, and $\exists^{\geq p} x. F$ if $c = p^+$.
We say that a formula $F$ has a counting degree of at most $p$ iff the only counting
quantifiers in $F$ are of the form $\exists^{c} x. G$ for some $c \in C_{p+1}$. A counting star
formula describes the neighborhood of an object by specifying an approximation of
the number of objects $x$ that realize each extension.

Definition 8 (Counting Star Formula). Let $x, x_1, \ldots, x_n$ and $y_1, \ldots, y_m$
be distinct variables, $k \geq 1$ a positive integer, and $F$ a GCCAT formula such
that $\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$. A $k$-counting star function for $F$ is a function
$\gamma : \mathsf{exts}(F, x) \to C_{k+1}$. A $k$-counting-star formula for $\gamma$ is a formula of the form

$\bigwedge_{j=1}^{m} y_j = x_{i_j} \wedge F \wedge \bigwedge_{F' \in \mathsf{exts}(F,x)} \exists^{\gamma(F')} x. F'$

where $1 \leq i_1, \ldots, i_m \leq n$.

Note that in Definition 8, formula $\bigwedge_{j=1}^{m} y_j = x_{i_j} \wedge F$ is an EQCAT formula, and
formula $\bigwedge_{j=1}^{m} y_j = x_{i_j} \wedge F \wedge F'$ is an EQCAT formula for each $F' \in \mathsf{exts}(F, x)$.
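
The abstraction of exact counts into $C_{k+1}$ can be illustrated on a concrete neighborhood. The sketch below assumes that extensions are represented by opaque labels and that the symbol $p^+$ is rendered as the string `"p+"` with the value of $p$ substituted; `abstract_count` and `counting_star` are hypothetical names.

```python
from collections import Counter

def abstract_count(n, p):
    """Map an exact count n to its representative in C_p = {0, ..., p-1, p+}."""
    return n if n < p else f"{p}+"

def counting_star(neighbor_types, all_types, k):
    """Build a k-counting star function from the extension realized by each neighbor.

    neighbor_types lists one extension label per neighbor; all_types lists
    every extension in exts(F, x). Values range over C_{k+1}.
    """
    counts = Counter(neighbor_types)
    return {t: abstract_count(counts[t], k + 1) for t in all_types}
```

For $k = 1$ the values range over $C_2 = \{0, 1, 2^+\}$, so three neighbors of one type are recorded as `"2+"`, a single neighbor as `1`, and an unrealized extension as `0`.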

Proposition 9 (Depth-One Normal Form). Let $F$ be a formula such that $F$
has quantifier depth at most one, $F$ has counting degree at most $k$, and
$\mathsf{FV}(F) \subseteq \{x_1, \ldots, x_n\}$. Then $F$ is equivalent to a disjunction of $k$-counting-star formulas
$F_C$ where $\mathsf{FV}(F_C) = \{x_1, \ldots, x_n\}$.

7.2 Spatial Conjunction of Stars 

Sketch of the Construction. Let $F_1$ and $F_2$ be two formulas of quantifier
depth at most one, and not containing the logical operation $\circledast$. By
Proposition 9, let $F_1$ be equivalent to the disjunction of counting star formulas
$\bigvee_{i=1}^{n_1} C_{1,i}$ and let $F_2$ be equivalent to the disjunction of counting star
formulas $\bigvee_{j=1}^{n_2} C_{2,j}$. By distributivity of $\circledast$ with respect to $\vee$, we have

$F_1 \circledast F_2 \approx (\bigvee_{i=1}^{n_1} C_{1,i}) \circledast (\bigvee_{j=1}^{n_2} C_{2,j}) \approx \bigvee_{i=1}^{n_1} \bigvee_{j=1}^{n_2} C_{1,i} \circledast C_{2,j}$

In the sequel we show
that a spatial conjunction of counting-star formulas is either contradictory or is
equivalent to a disjunction of counting star formulas. This suffices to eliminate
spatial conjunction of formulas of quantifier depth at most one. Moreover, if $F$ is
any formula of quantifier depth at most one, possibly containing $\circledast$, by repeated
elimination of the innermost $\circledast$ we obtain a formula without $\circledast$.

To compute the spatial conjunction of counting stars we establish an
alternative syntactic form for counting star formulas. The idea of this alternative
form is roughly to replace a counting quantifier such as $\exists^{=k} x. F'$ with a spatial
conjunction of $k$ formulas each of which has the meaning similar to $\exists^{=1} x. F'$, and
then combine a formula $\exists^{=1} x. F'_1$ resulting from one counting star with a formula
$\exists^{=1} x. F'_2$ resulting from another counting star into the formula $\exists^{=1} x. (F'_1 \oplus F'_2)$
where $\oplus$ denotes merging of GCCAT formulas by taking the union of their
positive literals. We next develop this idea in greater detail.

Notation for Spatial Representation of Stars. Let $\mathsf{Ge}(x_1, \ldots, x_n)$ be the
unique GCCAT formula $F$ with $\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$ such that the only positive
literals in $F$ are literals $x_i = x_i$ for $1 \leq i \leq n$. Similarly, there is a unique
formula $F' \in \mathsf{exts}(x_1, \ldots, x_n, x)$ such that every atomic formula in $F'$ distinct
from $x = x$ occurs in a negated literal. Call $F'$ an empty extension and denote
it $\mathsf{empEx}(x_1, \ldots, x_n, x)$.

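To compute a spatial conjunction of counting star formulas $C_1$ and $C_2$ in
the language $L$, we temporarily consider formulas in an extended language
$L' = L \cup \{B_1, B_2\}$ where $B_1$ and $B_2$ are two new unary predicates used to mark
formulas. We use $B_1$ to mark formulas derived from $C_1$, and use $B_2$ to mark
formulas derived from $C_2$. For $m \in \{\emptyset, \{1\}, \{2\}, \{1,2\}\}$, define

$\mathsf{Mark}_{\emptyset}(x) = \neg B_1(x) \wedge \neg B_2(x)$        $\mathsf{Mark}_{1}(x) = B_1(x) \wedge \neg B_2(x)$
$\mathsf{Mark}_{2}(x) = \neg B_1(x) \wedge B_2(x)$        $\mathsf{Mark}_{1,2}(x) = B_1(x) \wedge B_2(x)$

Let $F' \in \mathsf{exts}(x_1, \ldots, x_n, x)$. Define

$\mathsf{empEx}_{\emptyset}(x_1, \ldots, x_n, x) = \mathsf{empEx}(x_1, \ldots, x_n, x) \wedge \mathsf{Mark}_{\emptyset}(x)$
$\mathsf{empe}(x_1, \ldots, x_n) = \mathsf{Ge}(x_1, \ldots, x_n) \wedge \forall x. (\bigwedge_{i=1}^{n} x \neq x_i) \Rightarrow \mathsf{empEx}_{\emptyset}(x_1, \ldots, x_n, x)$

We write $\mathsf{empEx}_{\emptyset}(F, x)$ for $\mathsf{empEx}_{\emptyset}(x_1, \ldots, x_n, x)$ if $\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$, and
similarly for $\mathsf{empe}(F, x)$. We write simply $\mathsf{empe}$ if $F$ and $x$ are understood.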

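We next introduce formulas $(\!|F'|\!)^*_m$ and $(\!|F'|\!)_m$, which are the building blocks
for representing counting star formulas. Formula $(\!|F'|\!)^*_m$ means that $F'$ marked
with $m$ and $\mathsf{empEx}_{\emptyset}(F, x)$ are the only extensions of $F$ that hold in the
neighborhood of $x_1, \ldots, x_n$ ($F'$ may hold for any number of neighbors). Formula
$(\!|F'|\!)_m$ means that $F'$ holds for exactly one element in the neighborhood of
$x_1, \ldots, x_n$, and all other neighbors have empty extensions. More precisely, let
$F' \in \mathsf{exts}(x_1, \ldots, x_n, x)$. Define

$(\!|F'|\!)^*_m = \mathsf{Ge}(x_1, \ldots, x_n) \wedge \forall x. (\bigwedge_{i=1}^{n} x \neq x_i) \Rightarrow (F' \wedge \mathsf{Mark}_m(x)) \vee \mathsf{empEx}_{\emptyset}(F, x)$
$(\!|F'|\!)_m = (\!|F'|\!)^*_m \wedge \exists^{=1} x. \bigwedge_{i=1}^{n} x \neq x_i \wedge F' \wedge \mathsf{Mark}_m(x)$

where $m \in \{\emptyset, \{1\}, \{2\}, \{1,2\}\}$. Observe that $G \circledast \mathsf{empe} \approx G$ if $G = (\!|F'|\!)^*_m$ or
$G = (\!|F'|\!)_m$ for some $F'$ and $m$. Also note that $(\!|F'|\!)^*_m \circledast (\!|F'|\!)^*_m \approx (\!|F'|\!)^*_m$.

$E \wedge F$: EQCAT formula, $F$: GCCAT formula

$\mathcal{S}_m[\![E \wedge F \wedge \exists^{s_1} x. F'_1 \wedge \ldots \wedge \exists^{s_k} x. F'_k]\!] = E \wedge \mathcal{K}[\![F]\!] \circledast \mathcal{X}_m[\![\exists^{s_1} x. F'_1]\!] \circledast \ldots \circledast \mathcal{X}_m[\![\exists^{s_k} x. F'_k]\!]$
$\mathcal{K}[\![F]\!] = F \wedge (\forall x. (\bigwedge_{i=1}^{n} x \neq x_i) \Rightarrow \mathsf{empEx}_{\emptyset}(F, x))$
$\mathcal{X}_m[\![\exists^{0} x. F']\!] = \mathsf{empe}$        $\mathcal{X}_m[\![\exists^{i+1} x. F']\!] = (\!|F'|\!)_m \circledast \mathcal{X}_m[\![\exists^{i} x. F']\!]$
$\mathcal{X}_m[\![\exists^{i^+} x. F']\!] = \mathcal{X}_m[\![\exists^{i} x. F']\!] \circledast (\!|F'|\!)^*_m$

Fig. 6. Translation of Counting Stars to Spatial Notation

(1) $(\!|T_1|\!)_1 \circledast (\!|T_2|\!)_2 \leadsto (\!|T_1 \oplus T_2|\!)_{1,2}$
(2) $(\!|T_1|\!)_1 \circledast (\!|T_2|\!)^*_2 \leadsto (\!|T_1 \oplus T_2|\!)_{1,2} \circledast (\!|T_2|\!)^*_2$
(3) $(\!|T_1|\!)^*_1 \circledast (\!|T_2|\!)_2 \leadsto (\!|T_1|\!)^*_1 \circledast (\!|T_1 \oplus T_2|\!)_{1,2}$
(4) $(\!|T_1|\!)^*_1 \circledast (\!|T_2|\!)^*_2 \leadsto (\!|T_1|\!)^*_1 \circledast (\!|T_2|\!)^*_2 \circledast (\!|T_1 \oplus T_2|\!)^*_{1,2}$
(5) $(\!|T|\!)^*_1 \leadsto \mathsf{empe}$
(6) $(\!|T|\!)^*_2 \leadsto \mathsf{empe}$

Fig. 7. Transformation Rules for Combining Spatial Conjuncts

Translation of Counting Stars. Figure 6 presents the translation of counting
stars to spatial notation. The idea of the translation is to replace $\exists^{=k} x. F'$ with
the spatial conjunction of $k$ formulas $(\!|F'|\!)_m \circledast \ldots \circledast (\!|F'|\!)_m$ where $m \in \{\{1\}, \{2\}\}$.
The purpose of the marker $m$ is to ensure that each of the $k$ witnesses for $x$ that
are guaranteed to exist by $(\!|F'|\!)_m \circledast \ldots \circledast (\!|F'|\!)_m$ are distinct. The reason that the
witnesses are distinct for $m \neq \emptyset$ is that no two of them can satisfy $B_i(x)$ at the
same time for $i \in m$.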

To show the correctness of the translation in Figure 6, define $e^m$ to be the
$L'$-environment obtained by extending the $L$-environment $e$ according to marking
$m$, and $\overline{e_1}$ to be the restriction of the $L'$-environment $e_1$ to the language $L$.
More precisely, if $e$ is an $L$-environment, for $m \in \{\emptyset, \{1\}, \{2\}, \{1,2\}\}$, define
the $L'$-environment $e^m$ by 1) $e^m\,r = e\,r$ for $r \in L$ and 2) for $q \in \{1, 2\}$, let
$(e^m B_q)\,d = \mathsf{True} \iff q \in m \wedge d \notin \{e\,x_1, \ldots, e\,x_n\}$. Conversely, if $e_1$ is an
$L'$-environment, define the $L$-environment $\overline{e_1}$ by $\overline{e_1}\,r = e_1\,r$ for all $r \in L$. Lemma 10
below gives the correctness criterion for the translation in Figure 6.

Lemma 10. If $e$ is an $L$-environment, $C$ a counting star formula in $L$, and
$m \in \{\{1\}, \{2\}, \{1,2\}\}$, then $[\![C]\!]e = [\![\mathcal{S}_m[\![C]\!]]\!]e^m$.

Combining Quantifier-Free Formulas. Let $C_1 \circledast C_2$ be a spatial conjunction
of two counting-star formulas

$C_1 = E \wedge F_1 \wedge \exists^{s_{1,1}} x. F'_{1,1} \wedge \ldots \wedge \exists^{s_{1,k}} x. F'_{1,k}$
$C_2 = E \wedge F_2 \wedge \exists^{s_{2,1}} x. F'_{2,1} \wedge \ldots \wedge \exists^{s_{2,k}} x. F'_{2,k}$

where $F_1$ and $F_2$ are GCCAT formulas with $\mathsf{FV}(F_1) = \mathsf{FV}(F_2) = \{x_1, \ldots, x_n\}$,
$E \wedge F_1$ and $E \wedge F_2$ are EQCAT formulas, and $E = \bigwedge_{j=1}^{m} y_j = x_{i_j}$. To show how to
transform the formula $\mathcal{S}_1[\![C_1]\!] \circledast \mathcal{S}_2[\![C_2]\!]$ into a disjunction of formulas of the form
$\mathcal{S}_{1,2}[\![C_3]\!]$, we introduce the following notation. If $T$ is a formula, let $S(T)$ denote
the set of positive literals in $T$ that do not contain equality. Let $T_1 \in \mathsf{exts}(F_1, x)$
and $T_2 \in \mathsf{exts}(F_2, x)$. (Note that $\mathsf{exts}(F_1, x) = \mathsf{exts}(F_2, x)$.) We define the partial
operation $T_1 \oplus T_2$ as follows. The result of $T_1 \oplus T_2$ is defined iff $S(T_1) \cap S(T_2) = \emptyset$. If
$S(T_1) \cap S(T_2) = \emptyset$, then $T_1 \oplus T_2 = T$ where $T$ is the unique element of $\mathsf{exts}(F_1, x)$
such that $S(T) = S(T_1) \cup S(T_2)$. Similarly to $\oplus$, we define the partial operation
$F_1 \odot F_2$ for $F_1$ and $F_2$ GCCAT formulas with $\mathsf{FV}(F_1) = \mathsf{FV}(F_2) = \{x_1, \ldots, x_n\}$.
The result of $F_1 \odot F_2$ is defined iff $S(F_1) \cap S(F_2) = \emptyset$. If $S(F_1) \cap S(F_2) = \emptyset$, then
$F_1 \odot F_2$ is the unique GCCAT formula $F$ such that $\mathsf{FV}(F) = \{x_1, \ldots, x_n\}$ and
$S(F) = S(F_1) \cup S(F_2)$. The following Lemma 11 notes that $\oplus$ and $\odot$ are sound
rules for computing spatial conjunction of certain quantifier-free formulas.

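Lemma 11. If $T_1, T_2 \in \mathsf{exts}(x_1, \ldots, x_n, x)$ then $T_1 \circledast T_2 \approx T_1 \oplus T_2$. If $F_1$ and
$F_2$ are GCCAT formulas with $\mathsf{FV}(F_1) = \mathsf{FV}(F_2) = \{x_1, \ldots, x_n\}$, then
$F_1 \circledast F_2 \approx F_1 \odot F_2$.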
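
Since an extension in $\mathsf{exts}(F_1, x)$ is determined by its set of positive non-equality literals, the partial merge can be sketched as an operation on such sets. The representation of literals as strings and the name `merge` are assumptions for illustration; `None` stands for the undefined case.

```python
def merge(s1, s2):
    """Partial merge of two extensions given by their positive-literal sets.

    Returns the union when the sets are disjoint, and None when the
    operation is undefined because a positive literal occurs in both.
    """
    s1, s2 = frozenset(s1), frozenset(s2)
    if s1 & s2:
        return None
    return s1 | s2
```

For example, merging `{"f(x1,x)"}` with `{"g(x1,x)"}` yields the two-literal set, while merging `{"f(x1,x)"}` with itself is undefined, which reflects the requirement that the two environments of a split do not share an edge.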

Rules for Transforming Spatial Conjuncts. We transform the formula
$\mathcal{S}_1[\![C_1]\!] \circledast \mathcal{S}_2[\![C_2]\!]$ into a disjunction of formulas of the form $\mathcal{S}_{1,2}[\![C_3]\!]$ as follows.

The first step in transforming $C_1 \circledast C_2$ is to replace $\mathcal{K}[\![F_1]\!] \circledast \mathcal{K}[\![F_2]\!]$ with
$\mathcal{K}[\![F_1 \odot F_2]\!]$ if $F_1 \odot F_2$ is defined, or False if $F_1 \odot F_2$ is not defined.

The second step is summarized in Figure 7, which presents rules for
combining conjuncts resulting from $\mathcal{X}_1[\![\exists^{s_1} x. F_1]\!]$ and $\mathcal{X}_2[\![\exists^{s_2} x. F_2]\!]$ into conjuncts of
the form $\mathcal{X}_{1,2}[\![\exists^{s} x. F]\!]$. The intuition is that $(\!|T|\!)^*_m$ and $(\!|T|\!)_m$ represent a finite
abstraction of all possible neighborhoods of $x_1, \ldots, x_n$, and the rules in Figure 7
represent the ways in which different portions of the neighborhoods combine
using spatial conjunction. We apply the rules in Figure 7 modulo commutativity
and associativity of $\circledast$, the fact that emp is a unit for $\circledast$, and the idempotence
of $(\!|T|\!)^*_m$. Rules (1)–(4) are applicable only when the occurrence of $T_1 \oplus T_2$ on
the right-hand side of the rule is defined. We apply rules (1)–(4) as long as
possible, and then apply rules (5), (6). Moreover, we only allow the sequences of
rule applications that eliminate all occurrences of $(\!|T|\!)_1$, $(\!|T|\!)^*_1$, $(\!|T|\!)_2$, $(\!|T|\!)^*_2$, leaving
only $(\!|T|\!)_{1,2}$ and $(\!|T|\!)^*_{1,2}$. The following Lemma 12 gives the partial correctness of
the rules in Figure 7.

Lemma 12. If $G_1 \leadsto G_2$, then $G_2 \Rightarrow G_1$ is valid.

Define $G_1 \Longrightarrow G_2$ to hold iff both of the following two conditions hold: 1) $G_2$
results from $G_1$ by replacing $\mathcal{K}[\![F_1]\!] \circledast \mathcal{K}[\![F_2]\!]$ with $\mathcal{K}[\![F_1 \odot F_2]\!]$ if $F_1 \odot F_2$ is defined,
or False if $F_1 \odot F_2$ is not defined, and then applying some sequence of rules in
Figure 7 such that rules (5), (6) are applied only when rules (1)–(4) are not
applicable; 2) $G_2$ contains only spatial conjuncts of the form $(\!|T|\!)_{1,2}$ and $(\!|T|\!)^*_{1,2}$.
From Lemma 12 and Lemma 11 we immediately obtain Lemma 13.

Lemma 13. If $G_1 \Longrightarrow G_2$, then $G_2 \Rightarrow G_1$ is valid.

The rule for computing the spatial conjunction of counting star formulas is the
following. If $C_1$, $C_2$, and $C_3$ are counting star formulas, define $\mathcal{R}(C_1, C_2, C_3)$ to
hold iff $\mathcal{S}_1[\![C_1]\!] \circledast \mathcal{S}_2[\![C_2]\!] \Longrightarrow \mathcal{S}_{1,2}[\![C_3]\!]$. We compute spatial conjunction by
replacing $C_1 \circledast C_2$ with $\bigvee_{\mathcal{R}(C_1,C_2,C_3)} C_3$. Our goal is therefore to show the equivalence

$C_1 \circledast C_2 \approx \bigvee_{\mathcal{R}(C_1,C_2,C_3)} C_3$    (2)

The validity of $\bigvee_{\mathcal{R}(C_1,C_2,C_3)} C_3 \Rightarrow (C_1 \circledast C_2)$ follows from Lemma 13 and
Lemma 10.

Lemma 14. $(\bigvee_{\mathcal{R}(C_1,C_2,C_3)} C_3) \Rightarrow (C_1 \circledast C_2)$ is a valid formula for every pair of
counting star formulas $C_1$ and $C_2$.

We next consider the converse claim. If $[\![C_1 \circledast C_2]\!]e$, then there are $e_1$ and $e_2$ such
that $\mathsf{split}\ e\ [e_1\ e_2]$, $[\![C_1]\!]e_1$, and $[\![C_2]\!]e_2$. By considering the atomic types induced
in $e$, $e_1$ and $e_2$ by elements in $D \setminus \{e\,x_1, \ldots, e\,x_n\}$, we can construct a sequence
of $\leadsto$ transformations in Figure 7 that convert $\mathcal{S}_1[\![C_1]\!] \circledast \mathcal{S}_2[\![C_2]\!]$ into a formula
$\mathcal{S}_{1,2}[\![C_3]\!]$ such that $[\![C_3]\!]e = \mathsf{True}$.

Lemma 15. $C_1 \circledast C_2 \Rightarrow \bigvee_{\mathcal{R}(C_1,C_2,C_3)} C_3$ is a valid formula for every pair of
counting star formulas $C_1$ and $C_2$.

Theorem 16. The equivalence (2) holds for every pair of counting star formulas
$C_1$ and $C_2$.

8 Further Related Work 

Records have been studied in the context of functional and object-oriented pro- 
gramming languages [7,13,17,31,32,36]. The main difference between existing 
record notations and our system is that the interpretation of a record in our 
system is a predicate on an object, where an object is linked to other objects 
forming a graph, as opposed to being a type that denotes a value (with values 
typically representable as trees). Our view is appropriate for programming lan- 
guages such as Java and ML that can manipulate structures using destructive 
updates. Our generalizations allow developers to express both incoming and
outgoing references of objects, and to express typestate
changes.

We have developed role logic to provide a foundation for role analysis [19]. 
We have subsequently studied a simplification of role analysis constraints and 
characterized such constraints using formulas [20,21]. Multifields and multislots 
are present already in [18, Section 6.1]. In this paper we have shown that role logic
provides a unifying framework for all these constraints and goes beyond them
in 1) being closed under boolean operations, and 2) being closed under spatial
conjunction for an interesting class of formulas. The view of roles as predicates
is equivalent to the view of roles as sets and works well in the presence of data 
abstraction [27]. 

The parametric analysis based on three- valued logic is presented in [34]. 
Other approaches to verifying shape invariants include [8,11,15,28]. A decidable 
logic for expressing connectivity properties of the heap was presented in [4] . We 
use spatial conjunction from separation logic that has been used for reasoning 
about the heap [6,16,33]. Description logics [1] share many of the properties of 
role logic and have been traditionally applied to knowledge bases. 
9 Conclusions

We have shown how to add notation for records to two-variable role logic while
preserving its decidability. The resulting notation supports a generalization of 
traditional records with record specifications that are closed under all boolean 
operations as well as record concatenation, allow the description of typestate 
properties, support inverse records, and capture the distinction between open 
and closed records. We believe that such an expressive and decidable notation is 
useful as an annotation language used with program analyses and type systems. 
Abstract. Termination has been a subject of intensive research in the
logic programming community for the last two decades. Most works deal 
with proving universal left termination of a given class of queries, i.e. 
finiteness of all the possible derivations produced by a Prolog engine from 
any query in that class. In contrast, the study of the dual problem, non-
termination w.r.t. the left selection rule, i.e. the existence of one query in
a given class of queries which admits an infinite left derivation, has given
rise to only a few papers. In this article, we study non-termination in the 
more general constraint logic programming framework. We rephrase our 
previous logic programming approach into this more abstract setting, 
which leads to a criterion expressed in a logical way and simpler proofs, 
as expected. Also, by reconsidering our previous work, we now prove 
that in some sense, we already had the best syntactic criterion for logic 
programming. Last but not least, we offer a set of correct algorithms for 
inferring non-termination for CLP. 



1 Introduction 

Termination has been a subject of intensive research in the logic programming 
community for the last two decades, see the survey [4]. A more recent look on 
the topic, and its extension to the constraint logic programming paradigm [8, 
9] is given in [14]. Most works deal with proving universal left termination of a 
given class of queries, i.e. finiteness of all the possible derivations produced by a 
Prolog engine from any query in that class. Some of these works, e.g. [11,7,12],
consider the reverse problem of inferring classes of queries for which universal 
left termination is ensured. 

In contrast, the study of the dual problem, non-termination w.r.t. the left
selection rule, i.e. the existence of one query in a given class of queries which
admits an infinite left derivation, has given rise to only a few papers, e.g. [3,
5]. Recently we have also investigated this problem in the logic programming 
setting [13], where we proposed an analysis to infer non-termination. 

In this paper, we study non-termination in the more general constraint logic 
programming framework. We rephrase our approach into this more abstract 
setting, which leads to a necessary and sufficient criterion expressed in a logical 
way and simpler proofs, as expected. Also, by reconsidering our previous work, 
we now prove that in some sense, we already had the best syntactic criterion for 
logic programming. Last but not least, we offer a set of correct algorithms for
inferring non-termination for CLP. The analysis is fully implemented (see footnote 1).

We organize the paper as follows. After the preliminaries presented in Section 2,
some basic properties related to non-termination for CLP are given in Section 3.
The technical machinery behind our approach is described in Section 4 and
Section 5. Section 6 concludes.

2 Preliminaries 

We recall some basic definitions on CLP, see [9] for more details. 

2.1 Constraint Domains 

In this paper, we consider a constraint logic programming language CLP(C)
based on the constraint domain
$C := \langle \Sigma_C, \mathcal{L}_C, \mathcal{D}_C, \mathcal{T}_C, solve_C \rangle$.

$\Sigma_C$ is the constraint domain signature, which is a pair $\langle F_C, \Pi_C \rangle$
where $F_C$ is a set of function symbols and $\Pi_C$ is a set of predicate symbols.
The class of constraints $\mathcal{L}_C$ is a set of first-order $\Sigma_C$-formulas.
The domain of computation $\mathcal{D}_C$ is a $\Sigma_C$-structure that is the intended
interpretation of the constraints and $D_C$ is the domain of $\mathcal{D}_C$. The
constraint theory $\mathcal{T}_C$ is a $\Sigma_C$-theory describing the logical semantics
of the constraints. We suppose that $C$ is ideal, i.e. the constraint solver,
$solve_C$, is a computable function which maps each formula in $\mathcal{L}_C$ to one of
true or false indicating whether the formula is satisfiable or unsatisfiable.

We assume that the predicate symbol $=$ is in $\Sigma_C$ and that it is interpreted as
identity in $D_C$. A primitive constraint is either the always satisfiable constraint
true or the unsatisfiable constraint false or has the form $p(\tilde{t})$ where
$p \in \Pi_C$ and $\tilde{t}$ is a finite sequence of terms in $\Sigma_C$. We suppose that
$\mathcal{L}_C$ contains all the primitive constraints and that it is closed under
variable renaming, existential quantification and conjunction.

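We suppose that $\mathcal{D}_C$ and $\mathcal{T}_C$ correspond on $\mathcal{L}_C$, i.e.

- $\mathcal{D}_C$ is a model of $\mathcal{T}_C$ and

- for every constraint $c \in \mathcal{L}_C$, $\mathcal{D}_C \models \exists c$ if and only if
  $\mathcal{T}_C \models \exists c$.

Moreover, we suppose that $\mathcal{T}_C$ is satisfaction complete w.r.t. $\mathcal{L}_C$, i.e.
for every constraint $c \in \mathcal{L}_C$, either $\mathcal{T}_C \models \exists c$ or
$\mathcal{T}_C \models \neg\exists c$. We also assume that the theory and the solver agree
in the sense that for every $c \in \mathcal{L}_C$, $solve_C(c) = true$ if and only if
$\mathcal{T}_C \models \exists c$. Consequently, as $\mathcal{D}_C$ and $\mathcal{T}_C$ correspond
on $\mathcal{L}_C$, we have, for every $c \in \mathcal{L}_C$, $solve_C(c) = true$ if and only
if $\mathcal{D}_C \models \exists c$.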

A valuation is a function that maps all variables into $D_C$. We write $O\sigma$
(instead of $\sigma(O)$) to denote the result of applying a valuation $\sigma$ to an
object $O$. If $c$ is a constraint, we write $\mathcal{D}_C \models c$ if for every
valuation $\sigma$, $c\sigma$ is true in $\mathcal{D}_C$, i.e. $\mathcal{D}_C \models_\sigma c$.
Hence, $\mathcal{D}_C \models c$ is the same as $\mathcal{D}_C \models \forall c$. Valuations
are denoted by $\sigma, \eta, \theta, \ldots$ in the sequel of this paper.

Footnote 1: http://www.univ-reunion.fr/~gcc/
Example 1 ($\mathcal{R}_{lin}$). The constraint domain $\mathcal{R}_{lin}$ has $<$, $\le$,
$=$, $\ge$ and $>$ as predicate symbols, $+$, $-$, $*$, $/$ as function symbols and
sequences of digits (possibly with a decimal point) as constant symbols. Only
linear constraints are admitted. The domain of computation is the structure with
reals as domain and where the predicate symbols and the function symbols are
interpreted as the usual relations and functions over reals. The theory is the
theory of real closed fields [16]. A constraint solver for $\mathcal{R}_{lin}$ always
returning either true or false is described in [15]. □

Example 2 (Logic Programming). The constraint domain Term has $=$ as predicate
symbol and strings of alphanumeric characters as function symbols. The
domain of computation of Term is the set of finite trees (or, equivalently, of finite
terms), Tree, while the theory $\mathcal{T}_{Term}$ is Clark's equality theory [1]. The
interpretation of a constant is a tree with a single node labeled with the constant.
The interpretation of an n-ary function symbol $f$ is the function
$f_{Tree} : Tree^n \rightarrow Tree$ mapping the trees $T_1, \ldots, T_n$ to a new tree with
root labeled with $f$ and with $T_1, \ldots, T_n$ as child nodes. A constraint solver
always returning either true or false is provided by the unification algorithm.
CLP(Term) coincides then with logic programming. □

2.2 Operational Semantics 

The signature in which all programs and queries under consideration are included
is $\Sigma_L := \langle F_L, \Pi_L \rangle$ with $F_L := F_C$ and $\Pi_L := \Pi_C \cup \Pi'_L$
where $\Pi'_L$, the set of predicate symbols that can be defined in programs, is
disjoint from $\Pi_C$. We assume that each predicate symbol $p$ in $\Pi_L$ has a unique
arity denoted by $arity(p)$.

An atom has the form $p(\tilde{t})$ where $p \in \Pi'_L$, $arity(p) = n$ and $\tilde{t}$ is
a sequence of $n$ terms in $\Sigma_L$. A CLP(C) program is a finite set of rules. A rule
has the form $p(\tilde{x}) \leftarrow c \diamond q_1(\tilde{y}_1), \ldots, q_n(\tilde{y}_n)$ where
$p, q_1, \ldots, q_n$ are predicate symbols in $\Pi'_L$, $c$ is a finite conjunction of
primitive constraints and $\tilde{x}, \tilde{y}_1, \ldots, \tilde{y}_n$ are disjoint sequences
of distinct variables. Hence, $c$ is the conjunction of all constraints,
including unifications. A query has the form $\langle Q \mid d \rangle$ where $Q$ is a finite
sequence of atoms and $d$ is a finite conjunction of primitive constraints. When $Q$
contains exactly one atom, the query is said to be atomic. The empty sequence of
atoms is denoted by $\Box$. The set of variables occurring in a syntactic object $O$
is denoted $Var(O)$.

The examples of this paper make use of the language CLP($\mathcal{R}_{lin}$) and the
language CLP(Term). Program and query examples are presented in teletype
font. Program and query variables begin with an upper-case letter, [Head|Tail]
denotes a list with head Head and tail Tail, and [] denotes an empty list.

We consider the following operational semantics given in terms of left derivations
from queries to queries. Let $\langle p(\tilde{t}), Q \mid d \rangle$ be a query and $r$ be a
rule. Let $r' := p(\tilde{x}) \leftarrow c \diamond B$ be a variant of $r$ variable disjoint
with $\langle p(\tilde{t}), Q \mid d \rangle$ such that
$solve_C(\tilde{x} = \tilde{t} \wedge c \wedge d) = true$ (where $\tilde{x} = \tilde{t}$ denotes the
constraint $x_1 = t_1 \wedge \cdots \wedge x_n = t_n$ with $\tilde{x} := x_1, \ldots, x_n$ and
$\tilde{t} := t_1, \ldots, t_n$). Then,
$\langle p(\tilde{t}), Q \mid d \rangle \Longrightarrow_{r} \langle B, Q \mid \tilde{x} = \tilde{t} \wedge c \wedge d \rangle$
is a left derivation step with $r'$ as its input rule. We write
$S \Longrightarrow^{+}_{P} S'$ to summarize a finite number (> 0) of left derivation
steps from $S$ to $S'$ where each input rule is a variant of a rule of $P$. Let $S_0$ be
a query. A maximal sequence $S_0 \Longrightarrow_{r_1} S_1 \Longrightarrow_{r_2} \cdots$ of left
derivation steps is called a left derivation of $P \cup \{S_0\}$ if $r_1, r_2, \ldots$
are rules from $P$ and if the standardization apart condition holds, i.e. each input
rule used is variable disjoint from the initial query $S_0$ and from the input rules
used at earlier steps. A finite left derivation ends up either with a query of the
form $\langle \Box \mid d \rangle$ with $\mathcal{T}_C \models \exists d$ (then it is a successful left
derivation) or with a query of the form $\langle Q \mid d \rangle$ with $Q \neq \Box$ or
$\mathcal{T}_C \models \neg\exists d$ (then it is a failed left derivation). We say $S_0$ left
loops with respect to $P$ if there exists an infinite left derivation of
$P \cup \{S_0\}$.

2.3 The Binary Unfoldings of a CLP(C) Program 

We say that $H \leftarrow c \diamond B$ is a binary rule if $B$ contains at most one atom. A
binary program is a finite set of binary rules.

Now we present the main ideas about the binary unfoldings [6] of a program,
borrowed from [2]. This technique transforms a program $P$ into a possibly infinite
set of binary rules. Intuitively, each generated binary rule $H \leftarrow c \diamond B$
specifies that, with respect to the original program $P$, a call to
$\langle H \mid d \rangle$ (or any of its instances) necessarily leads to a call to
$\langle B \mid c \wedge d \rangle$ (or its corresponding instance) if $c \wedge d$ is satisfiable.

More precisely, let $S$ be an atomic query. Then, the atomic query $\langle A \mid d \rangle$
is a call in a left derivation of $P \cup \{S\}$ if
$S \Longrightarrow^{+}_{P} \langle A, Q \mid d \rangle$. We denote by $calls_P(S)$ the set of
calls which occur in the left derivations of $P \cup \{S\}$. The specialization of the
goal independent semantics for call patterns for the left-to-right selection rule is
given as the fixpoint of an operator $T_P^\beta$ over the domain of binary rules, viewed
modulo renaming. In the definition below, $id$ denotes the set of all binary rules
of the form $p(\tilde{x}) \leftarrow \tilde{x} = \tilde{y} \diamond p(\tilde{y})$ for any
$p \in \Pi'_L$ and $\overline{\exists}_V c$ denotes the projection of a constraint $c$ onto
the set of variables $V$. Moreover, for atoms $A := p(\tilde{t})$ and
$A' := p(\tilde{t}')$ we write $A = A'$ as an abbreviation for the constraint
$\tilde{t} = \tilde{t}'$.

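\[
T_P^\beta(X) = \{ H \leftarrow c \diamond B \mid H \leftarrow c \diamond B \in P,\ \mathcal{D}_C \models \exists c,\ B = \Box \} \;\cup\;
\left\{ H \leftarrow c \diamond B \;\middle|\;
\begin{array}{l}
r := H \leftarrow c_0 \diamond B_1, \ldots, B_m \in P,\ i \in [1, m] \\
\langle H_j \leftarrow c_j \diamond \Box \rangle_{j=1}^{i-1} \in X \text{ renamed apart from } r \\
H_i \leftarrow c_i \diamond B \in X \cup id \text{ renamed apart from } r \\
i < m \Rightarrow B \neq \Box \\
c = \overline{\exists}_{Var(H,B)} \big[ c_0 \wedge \bigwedge_{j=1}^{i} (c_j \wedge \{B_j = H_j\}) \big] \\
\mathcal{D}_C \models \exists c
\end{array}
\right\}
\]

We define its powers as usual. It can be shown that the least fixpoint of this
monotonic operator always exists and we set

\[ bin\_unf(P) := lfp(T_P^\beta). \]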
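As a rough illustration of this fixpoint construction, the following Python sketch computes the binary unfoldings of a propositional program, that is, a program whose predicates all have arity 0 and whose constraints are all true. This restriction, the tuple encoding of rules and the names `tp_beta`, `bin_unf` and `left_loops` are assumptions made for the example and do not come from the paper; in this setting the projection and the satisfiability test are trivial and the fixpoint is finite.

```python
# A program rule is (head, (b1, ..., bm)) over predicate names.
# A binary rule is (head, body) where body is None for the empty body.

def tp_beta(program, x):
    """One application of the unfolding operator to the set x of binary rules."""
    preds = {h for h, _ in program} | {b for _, body in program for b in body}
    identity = {(p, p) for p in preds}           # the rules p <- p of id
    out = set()
    for head, body in program:
        m = len(body)
        if m == 0:                               # facts are kept as they are
            out.add((head, None))
            continue
        for i in range(1, m + 1):
            # B_1 .. B_{i-1} must already be known to succeed
            if any((body[j], None) not in x for j in range(i - 1)):
                break
            for h_i, b in x | identity:
                if h_i != body[i - 1]:
                    continue
                if i < m and b is None:          # i < m implies B is not empty
                    continue
                out.add((head, b))
    return out


def bin_unf(program):
    """Iterate tp_beta from the empty set until it stabilizes."""
    x = set()
    while True:
        nxt = tp_beta(program, x)
        if nxt == x:
            return x
        x = nxt


def left_loops(program, pred):
    """In this propositional setting, pred loops iff it calls some q with q <- q."""
    bu = bin_unf(program)
    reachable = {pred} | {b for h, b in bu if h == pred and b is not None}
    return any((q, q) in bu for q in reachable)


# p <- q, r.    q.    r <- p.
program = [("p", ("q", "r")), ("q", ()), ("r", ("p",))]
```

For this small program, `bin_unf(program)` contains the binary rule `("p", "p")`, which reflects the fact that a call to `p` eventually leads to another call to `p`.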




Non-termination Inference for Constraint Logic Programs 






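Then, the calls that occur in the left derivations of $P \cup \{S\}$, with
$S := \langle p(\tilde{t}) \mid d \rangle$, can be characterized as follows:

\[
calls_P(S) = \left\{ \langle B \mid \tilde{t} = \tilde{t}' \wedge c \wedge d \rangle \;\middle|\;
\begin{array}{l}
p(\tilde{t}') \leftarrow c \diamond B \in bin\_unf(P) \\
\mathcal{D}_C \models \exists(\tilde{t} = \tilde{t}' \wedge c \wedge d)
\end{array}
\right\}
\]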



Similarly, $bin\_unf(P)$ gives a goal independent representation of the success
patterns of $P$. But we can extract more information from the binary unfoldings
of a program $P$: universal left termination of an atomic query $S$ with respect
to $P$ is identical to universal termination of $S$ with respect to $bin\_unf(P)$. Note
that the selection rule is irrelevant for a binary program and an atomic query,
as each subsequent query has at most one atom. The following result lies at the
heart of Codish's approach to termination [2]:

Theorem 1 (Observing Termination). Let $P$ be a CLP(C) program and $S$
be an atomic query. Then, $S$ left loops w.r.t. $P$ if and only if $S$ loops w.r.t.
$bin\_unf(P)$.

Notice that $bin\_unf(P)$ is a possibly infinite set of binary rules. For this reason,
in the algorithms of Section 5, we compute only the first max iterations of $T_P^\beta$
where max is a parameter of the analysis. As an immediate consequence of
Theorem 1 frequently used in our proofs, assume that we detect that $S$ loops
with respect to a subset of the binary rules of $T_P^\beta \uparrow i$, with $i \in \mathbb{N}$.
Then $S$ loops with respect to $bin\_unf(P)$, hence $S$ left loops with respect to $P$.

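Example 3. Consider the CLP(Term) program $P$ (see [10], p. 56-58):

  $r_1 := q(X_1, X_2) \leftarrow X_1 = a \wedge X_2 = b \diamond \Box$
  $r_2 := p(X_1, X_2) \leftarrow X_1 = X_2 \diamond \Box$
  $r_3 := p(X_1, X_2) \leftarrow Y_1 = Z_2 \wedge Y_2 = X_2 \wedge Z_1 = X_1 \diamond p(Y_1, Y_2), q(Z_1, Z_2)$

Let $c_1$, $c_2$ and $c_3$ be the constraints in $r_1$, $r_2$ and $r_3$, respectively. The
binary unfoldings of $P$ are:

  $T_P^\beta \uparrow 0 = \emptyset$
  $T_P^\beta \uparrow 1 = \{r_1, r_2, p(x_1, x_2) \leftarrow c_3 \diamond p(y_1, y_2)\} \cup T_P^\beta \uparrow 0$
  $T_P^\beta \uparrow 2 = \{p(x_1, x_2) \leftarrow x_1 = a \wedge x_2 = b \diamond \Box,$
      $p(x_1, x_2) \leftarrow x_1 = z_1 \wedge x_2 = z_2 \diamond q(z_1, z_2)\} \cup T_P^\beta \uparrow 1$
  $T_P^\beta \uparrow 3 = \{p(x_1, x_2) \leftarrow x_1 = z_1 \wedge x_2 = b \wedge z_2 = a \diamond q(z_1, z_2),$
      $p(x_1, x_2) \leftarrow x_2 = z_2 \diamond q(z_1, z_2)\} \cup T_P^\beta \uparrow 2$
  $T_P^\beta \uparrow 4 = \{p(x_1, x_2) \leftarrow x_2 = b \wedge z_2 = a \diamond q(z_1, z_2)\} \cup T_P^\beta \uparrow 3$
  $T_P^\beta \uparrow 5 = T_P^\beta \uparrow 4 = bin\_unf(P)$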



2.4 Terminology 

In this paper, we design an algorithm that infers a finite set of left looping atomic
queries from the text of any CLP(C) program $P$. First, the algorithm computes
a finite subset of $bin\_unf(P)$ and then it proceeds with this subset only. For
this reason, and to simplify the exposition, the theoretical results we describe
below only deal with atomic queries and binary rules but can be easily extended
to any form of queries or rules. Consequently, in the sequel of this paper up to
Section 5, by a query we mean an atomic query, by a rule we mean a binary rule
and by a program we mean a binary program. Moreover, as mentioned above,
the selection rule is irrelevant for a binary program and an atomic query, so we
merely speak of derivation step, of derivation and of loops.

3 Loop Inference with Constraints 

In the logic programming framework, the subsumption test provides a simple
way to infer looping queries: if, in a logic program $P$, there is a rule
$p(\tilde{t}) \leftarrow p(\tilde{t}')$ such that $p(\tilde{t}')$ is more general than
$p(\tilde{t})$, then the query $p(\tilde{t})$ loops with respect to $P$. In this section, we
extend this result to the constraint logic programming framework. First, we
generalize the relation "is more general than":

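Definition 1 (More General Than). Let $S := \langle p(\tilde{t}) \mid d \rangle$ and
$S' := \langle p(\tilde{t}') \mid d' \rangle$ be two queries. We say that $S'$ is more general
than $S$ if $\{p(\tilde{t})\eta \mid \mathcal{D}_C \models_\eta d\} \subseteq \{p(\tilde{t}')\eta \mid \mathcal{D}_C \models_\eta d'\}$.

Example 4. Suppose that $C$ = Term. Let $S := \langle p(X) \mid X = f(f(Y)) \rangle$ and
$S' := \langle p(X) \mid X = f(Y) \rangle$. Then, as
$\{p(X)\eta \mid \mathcal{D}_C \models_\eta (X = f(f(Y)))\} \subseteq \{p(X)\eta \mid \mathcal{D}_C \models_\eta (X = f(Y))\}$,
$S'$ is more general than $S$. □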

This definition allows us to state a lifting result: 

Theorem 2 (Lifting). Consider a derivation step $S \Longrightarrow_{r} S_1$, a query $S'$
that is more general than $S$ and a variant $r'$ of $r$ variable disjoint with $S'$.
Then, there exists a query $S'_1$ that is both more general than $S_1$ and such that
$S' \Longrightarrow_{r} S'_1$ with input rule $r'$.

From this theorem, we derive two corollaries that can be used to infer looping 
queries just from the text of a CLP(C) program: 

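Corollary 1. Let $r := p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$ be a rule such that
$\mathcal{D}_C \models \exists c$. If $\langle p(\tilde{y}) \mid c \rangle$ is more general than
$\langle p(\tilde{x}) \mid c \rangle$ then $\langle p(\tilde{x}) \mid c \rangle$ loops w.r.t. $\{r\}$.

Corollary 2. Let $r := p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$ be a rule from a program $P$.
If $\langle q(\tilde{y}) \mid c \rangle$ loops w.r.t. $P$ then $\langle p(\tilde{x}) \mid c \rangle$ loops w.r.t. $P$.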

Example 5. Consider the CLP(Term) program APPEND:

  $r_1 := append(X_1, X_2, X_3) \leftarrow X_1 = [\,] \wedge X_2 = X_3 \diamond \Box$
  $r_2 := append(X_1, X_2, X_3) \leftarrow X_1 = [A|Y_1] \wedge X_2 = Y_2 \wedge X_3 = [A|Y_3] \diamond append(Y_1, Y_2, Y_3)$

Let $c_2$ be the constraint in the rule $r_2$. Then, $\mathcal{D}_{Term} \models \exists c_2$. Moreover,
we note that $\langle append(Y_1, Y_2, Y_3) \mid c_2 \rangle$ is more general than
$\langle append(X_1, X_2, X_3) \mid c_2 \rangle$. So, by Corollary 1,
$\langle append(X_1, X_2, X_3) \mid c_2 \rangle$ loops w.r.t. $\{r_2\}$, hence w.r.t. APPEND.
Hence, there exists an infinite derivation $\xi$ of
APPEND $\cup\ \{\langle append(X_1, X_2, X_3) \mid c_2 \rangle\}$. Then, if $S$ is a query that is more
general than $\langle append(X_1, X_2, X_3) \mid c_2 \rangle$, by successively applying the
Lifting Theorem 2 to each step of $\xi$ one can construct an infinite derivation of
APPEND $\cup\ \{S\}$. So, $S$ also loops w.r.t. APPEND. □
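In CLP(Term), one-way matching gives a sufficient test for the relation "is more general than" once each query has been solved into a single atom (for instance, the query with atom p(X) and constraint X = f(f(Y)) becomes p(f(f(Y)))). The Python sketch below illustrates this under that assumption; the term encoding (upper-case strings for variables, tuples for compound terms) and the names `match` and `more_general` are hypothetical choices made for the example.

```python
def is_var(t):
    """Variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()


def match(pattern, term, subst=None):
    """One-way matching: find a substitution s with pattern*s == term, or None.

    Variables of `term` are treated as constants.
    """
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if is_var(term):
        return None                      # a non-variable cannot match a variable
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None   # constants
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None                      # different functor or arity
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst


def more_general(s_prime, s):
    """Sufficient test: every instance of s is an instance of s_prime."""
    return match(s_prime, s) is not None


# Example 4: p(f(Y)) is more general than p(f(f(Y))).
s = ("p", ("f", ("f", "Y")))
s_prime = ("p", ("f", "Y"))

# Example 5: head and body atoms of r2 once the constraint is solved.
# The list [A|Y1] is encoded as (".", "A", "Y1").
head = ("append", (".", "A", "Y1"), "Y2", (".", "A", "Y3"))
body = ("append", "Y1", "Y2", "Y3")
```

Here `more_general(body, head)` holds, which corresponds to the premise of Corollary 1 for the rule r2 of APPEND.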

An extended version of Corollary 1, presented in the next section, together with 
the above Corollary 2 will be used to design the algorithms of Section 5 which 
infer classes of looping queries from the text of a program. 



4 Loop Inference Using Sets of Positions 

A basic idea in our work lies in identifying arguments in rules which can be 
disregarded when unfolding a query. Such arguments are said to be neutral. The 
point is that in many cases, considering this kind of argument allows us to infer
more looping queries.

Example 6 (Example 5 continued). The second argument of the predicate symbol
append is neutral for derivation with the rule $r_2$: if we hold a derivation $\xi$ of
a query $\langle append(t_1, t_2, t_3) \mid c \rangle$ w.r.t. $\{r_2\}$, then for any term $t$ there
exists a derivation of $\{r_2\} \cup \{\langle append(t_1, t, t_3) \mid c \rangle\}$ whose length is the
same as that of $\xi$. This means that we still get a looping query if we replace, in
every looping query inferred in Example 5, the second argument of append by any
term. □

In this section, we present a framework to describe specific arguments inside a 
program. Using this framework, we then give an operational definition of neutral 
arguments leading to a result extending Corollary 1 above. Finally, we relate the 
operational definition to an equivalent logical characterization and to a non- 
equivalent syntactic criterion. Hence, the results of this section extend those we 
presented in [13] where we defined, in the scope of logic programming, neutral 
arguments in a very syntactical way. 



4.1 Sets of Positions 

Definition 2 (Set of Positions). A set of positions, denoted by $\tau$, is a function
that maps each predicate symbol $p \in \Pi'_L$ to a subset of $[1, arity(p)]$.

Example 7. If we want to disregard the second argument of the predicate symbol
append defined in Example 5, we set $\tau := \langle append \mapsto \{2\} \rangle$. □

Using a set of positions $\tau$, one can restrict any atom by "erasing" the
arguments whose position is distinguished by $\tau$:

Definition 3 (Restriction). Let $\tau$ be a set of positions.

- Let $p \in \Pi'_L$ be a predicate symbol of arity $n$. The restriction of $p$ w.r.t.
  $\tau$ is the predicate symbol $p_\tau$. Its arity equals the number of elements of
  $[1, n] \setminus \tau(p)$.

- Let $A := p(t_1, \ldots, t_n)$ be an atom. The restriction of $A$ w.r.t. $\tau$, denoted
  by $A_\tau$, is the atom $p_\tau(t_{i_1}, \ldots, t_{i_m})$ where
  $\{i_1, \ldots, i_m\} = [1, n] \setminus \tau(p)$ and $i_1 \le \cdots \le i_m$.

- Let $S := \langle A \mid d \rangle$ be a query. The restriction of $S$ w.r.t. $\tau$, denoted by
  $S_\tau$, is the query $\langle A_\tau \mid d \rangle$.

Example 8 (Example 7 continued). The restriction of the query

  $\langle append(X, Y, Z) \mid X = [A|B] \wedge Y = a \wedge Z = [A|C] \rangle$

w.r.t. $\tau$ is the query $\langle append_\tau(X, Z) \mid X = [A|B] \wedge Y = a \wedge Z = [A|C] \rangle$. □
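The restriction operation of Definition 3 amounts to dropping the argument positions selected by the set of positions while leaving the constraint untouched. A minimal Python sketch, assuming atoms are encoded as tuples `(predicate, arg1, ..., argn)`, sets of positions as dictionaries from predicate names to sets of 1-based indices, and the suffix `_tau` as an ad hoc name for the restricted predicate (all of these are illustrative choices):

```python
def restrict_atom(atom, tau):
    """Erase the arguments of `atom` whose position is distinguished by tau."""
    pred, *args = atom
    erased = tau.get(pred, set())        # predicates not listed map to the empty set
    kept = [a for i, a in enumerate(args, start=1) if i not in erased]
    return (pred + "_tau", *kept)


def restrict_query(query, tau):
    """Restrict the atom of an atomic query; the constraint is kept as is."""
    atom, constraint = query
    return (restrict_atom(atom, tau), constraint)


tau = {"append": {2}}
query = (("append", "X", "Y", "Z"), "X = [A|B], Y = a, Z = [A|C]")
```

With these definitions, `restrict_query(query, tau)` keeps the first and third arguments of append and the whole constraint, as in Example 8.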

Sets of positions, together with the restriction they induce, lead to a gener- 
alization of the relation “is more general than” : 

Definition 4 ($\tau$-More General). Let $\tau$ be a set of positions and $S$ and $S'$ be
two queries. Then, $S'$ is $\tau$-more general than $S$ if $S'_\tau$ is more general than
$S_\tau$.

Example 9 (Example 7 continued). Since $\tau = \langle append \mapsto \{2\} \rangle$, we do not care
what happens to the second argument of append. So $\langle append(X, a, Z) \mid true \rangle$ is
$\tau$-more general than $\langle append(X, Y, Z) \mid true \rangle$ because
$\{append_\tau(X, Z)\eta \mid \mathcal{D}_C \models_\eta true\} \subseteq \{append_\tau(X, Z)\eta \mid \mathcal{D}_C \models_\eta true\}$. □



4.2 Derivation Neutral Sets of Positions 

Now we give a precise operational definition of the kind of arguments we are
interested in. The name "derivation neutral" stems from the fact that $\tau$-arguments
do not play any role in the derivation process.

Definition 5 (Derivation Neutral). Let $r$ be a rule and $\tau$ be a set of positions.
We say that $\tau$ is DN for $r$ if for each derivation step $S \Longrightarrow_{r} S_1$, for each
query $S'$ that is $\tau$-more general than $S$ and for each variant $r'$ of $r$ variable
disjoint with $S'$, there exists a query $S'_1$ that is $\tau$-more general than $S_1$ and
such that $S' \Longrightarrow_{r} S'_1$ with input rule $r'$. This definition is extended to
programs: $\tau$ is DN for $P$ if it is DN for each rule of $P$.

Therefore, while lifting a derivation, we can safely ignore derivation neutral 
arguments which can be instantiated to any term. As a consequence, we get the 
following extended version of Corollary 1: 

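Proposition 1. Let $r := p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$ be a rule such that
$\mathcal{D}_C \models \exists c$. Let $\tau$ be a set of positions that is DN for $r$. If
$\langle p(\tilde{y}) \mid c \rangle$ is $\tau$-more general than $\langle p(\tilde{x}) \mid c \rangle$ then
$\langle p(\tilde{x}) \mid c \rangle$ loops w.r.t. $\{r\}$.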

Finding out neutral arguments from the text of a program is not an easy 
task if we use the definition above. The next subsections present a logical and a 
syntactic characterization that can be used (see Section 5.2) to compute neutral 
arguments that appear inside a given program. 
4.3 A Logical Characterization

We distinguish the following sets of variables that appear within a rule: 

Definition 6. Let $r := p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$ be a rule and $\tau$ be a set of
positions.

1. Let $\tilde{x} := x_1, \ldots, x_h$. The set of variables of the head of $r$ that are
   distinguished by $\tau$ is $vars\_head(r, \tau) := \{x_i \in \tilde{x} \mid i \in \tau(p)\}$.

2. The set of local variables of $r$ is denoted by $local\_vars(r)$ and defined as:
   $local\_vars(r) := Var(c) \setminus (\tilde{x} \cup \tilde{y})$.

3. Let $\tilde{y} := y_1, \ldots, y_b$. The set of variables of the body of $r$ that are
   distinguished by $\tau$ is $vars\_body(r, \tau) := \{y_i \in \tilde{y} \mid i \in \tau(q)\}$.

Example 10 (Example 7 continued). Consider the rule

  $r := append(X_1, X_2, X_3) \leftarrow X_1 = [A|Y_1] \wedge X_2 = B \wedge B = Y_2 \wedge X_3 = [A|Y_3] \diamond append(Y_1, Y_2, Y_3)$.

We have: $vars\_head(r, \tau) = \{X_2\}$, $local\_vars(r) = \{A, B\}$ and
$vars\_body(r, \tau) = \{Y_2\}$. □
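The three variable sets of Definition 6 are plain set computations. The following Python sketch recomputes them for the rule of Example 10; the encoding of a rule by its head variables, body variables and the set of variables of its constraint is an assumption made for the illustration, as are the function names.

```python
def vars_head(head_vars, tau_p):
    """Head variables whose (1-based) position is distinguished by tau."""
    return {x for i, x in enumerate(head_vars, start=1) if i in tau_p}


def local_vars(constraint_vars, head_vars, body_vars):
    """Variables of the constraint occurring neither in the head nor in the body."""
    return set(constraint_vars) - set(head_vars) - set(body_vars)


def vars_body(body_vars, tau_q):
    """Body variables whose (1-based) position is distinguished by tau."""
    return {y for i, y in enumerate(body_vars, start=1) if i in tau_q}


# Rule of Example 10, with tau mapping append to {2}.
head = ["X1", "X2", "X3"]
body = ["Y1", "Y2", "Y3"]
constraint = {"X1", "A", "Y1", "X2", "B", "Y2", "X3", "Y3"}
tau_append = {2}
```

Since the head and the body of this rule use the same predicate symbol, the same set `tau_append` serves for both `vars_head` and `vars_body`.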

Now we give a logical definition of derivation neutrality. As we will see below, 
this definition is equivalent to the operational one we stated above. 

Definition 7 (Logical Derivation Neutral). Let $r := p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$
be a rule and $\tau$ be a set of positions. We say that $\tau$ is DNlog for $r$ if
$\mathcal{D}_C \models (c \rightarrow \forall_{X} \exists_{Y} c)$ where $X = vars\_head(r, \tau)$ and
$Y = local\_vars(r) \cup vars\_body(r, \tau)$.

So, $\tau$ is DNlog for $r$ if for any valuation $\sigma$ such that $\mathcal{D}_C \models_\sigma c$, if
one changes the value of $x\sigma$ where $x \in vars\_head(r, \tau)$ into any value, then
there exists a corresponding value for each $y\sigma$, where $y$ is in $local\_vars(r)$ or
in $vars\_body(r, \tau)$, such that $c$ still holds.

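Example 11 (Example 10 continued). The set of positions $\tau$ is DNlog for the rule
$r$ because $X = \{X_2\}$, $Y = \{A, B, Y_2\}$, $c$ is the constraint

  $(X_1 = [A|Y_1]) \wedge (X_2 = B) \wedge (B = Y_2) \wedge (X_3 = [A|Y_3])$

and for every valuation $\sigma$, if $\mathcal{D}_C \models_\sigma c$ then
$\mathcal{D}_C \models_\sigma \forall X_2 \exists B \exists Y_2\, c$, hence
$\mathcal{D}_C \models_\sigma \forall_{X} \exists_{Y} c$. □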

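Theorem 3. Let $r$ be a rule and $\tau$ be a set of positions. Then, $\tau$ is DNlog for $r$
if and only if $\tau$ is DN for $r$.

Example 12. Consider the rule $r := p(X) \leftarrow Y = f(X) \diamond p(Y)$. Let
$\tau := \langle p \mapsto \{1\} \rangle$. We have $vars\_head(r, \tau) = \{X\}$ and
$local\_vars(r) \cup vars\_body(r, \tau) = \{Y\}$. As the formula
$\forall X \exists Y\, Y = f(X)$ is true in $\mathcal{D}_{Term}$, so is
$\forall X, Y\, [Y = f(X) \rightarrow \forall X \exists Y\, Y = f(X)]$. Hence $\tau$ is DN for
the rule $r$. □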
4.4 A Syntactic Characterization

In [13], we gave, in the scope of logic programming, a syntactic definition of
neutral arguments. Now we extend this syntactic criterion to the more general
framework of constraint logic programming. First, we need rules in flat form:

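Definition 8 (Flat Rule). A rule $r := p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$ is said to be
flat if $c$ has the form $(\tilde{x} = \tilde{s} \wedge \tilde{y} = \tilde{t})$ for some sequences of terms
$\tilde{s}$ and $\tilde{t}$ such that $Var(\tilde{s}, \tilde{t}) \subseteq local\_vars(r)$.

Notice that there are some rules $r := p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$ for which there
exists no "equivalent" rule in flat form. More precisely, there exists no flat rule
$r' := p(\tilde{x}) \leftarrow c' \diamond q(\tilde{y})$ verifying
$\mathcal{D}_C \models [(\exists_{local\_vars(r)}\, c) \leftrightarrow (\exists_{local\_vars(r')}\, c')]$
(take for instance $r := p(X) \leftarrow X > 0 \diamond p(Y)$ in $\mathcal{R}_{lin}$).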

Next, we consider universal terms: 

Definition 9 (Universal Term). A term $t$ in $\Sigma_C$ is said to be universal if for
a variable $x$ not occurring in $t$ we have: $\mathcal{D}_C \models \forall x\, \exists_{Var(t)} (x = t)$.

Hence, a term $t$ is universal if it can take any value in $D_C$, i.e. if for any value
$a$ in $D_C$, there exists a valuation $\sigma$ such that $\mathcal{D}_C \models (a = t\sigma)$.

Example 13. A term $t$ in $\Sigma_{Term}$ is universal if and only if $t$ is a variable. If $x$
is a variable, then $x$, $x + 0$, $x + 1 + (-1)$, ... and $x + 1$ or $2 * x$ are universal
terms in $\mathcal{R}_{lin}$. □
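For linear terms over the reals, universality can be decided by inspecting coefficients: a term a_1 * x_1 + ... + a_n * x_n + b can take any real value exactly when at least one coefficient a_i is nonzero. The Python sketch below assumes a linear term has already been normalized into a dictionary of coefficients plus a constant; this encoding and the function name `is_universal_linear` are hypothetical and only serve the illustration.

```python
def is_universal_linear(coefficients, constant=0):
    """A normalized linear term is universal iff some coefficient is nonzero.

    `coefficients` maps variable names to their (real) coefficient.
    The constant does not matter: it only shifts the range of the term.
    """
    return any(a != 0 for a in coefficients.values())


x_plus_1 = ({"x": 1}, 1)           # x + 1
two_x = ({"x": 2}, 0)              # 2 * x
a_minus_a = ({"A": 0}, 0)          # A - A, normalized
five = ({}, 5)                     # the constant 5
```

Under this test the term A - A is not universal, since after normalization its only coefficient is zero.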

Now, we can define syntactic derivation neutrality:

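Definition 10 (Syntactic Derivation Neutral). Consider a flat rule
$r := p(\tilde{x}) \leftarrow (\tilde{x} = \tilde{s} \wedge \tilde{y} = \tilde{t}) \diamond q(\tilde{y})$ with
$\tilde{s} := s_1, \ldots, s_h$ and $\tilde{t} := t_1, \ldots, t_b$ ($h$ and $b$ are the arity of
$p$ and $q$ respectively). Let $\tau$ be a set of positions. We say that $\tau$
is DNsyn for $r$ if, for each $i \in \tau(p)$:

  (C1) $s_i$ is a universal term and
  (C2) $\forall j \in [1, h] \setminus \{i\}$, $Var(s_i) \cap Var(s_j) = \emptyset$ and
  (C3) $\forall j \in [1, b] \setminus \tau(q)$, $Var(s_i) \cap Var(t_j) = \emptyset$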

Example 14. The rule $p(X_1) \leftarrow X_1 = Z \wedge Y_1 = Z \diamond p(Y_1)$ is flat and the set of
positions $\langle p \mapsto \{1\} \rangle$ is DNsyn for it. The rule $p(X) \leftarrow X > 0 \diamond p(Y)$ has no
DNsyn set of positions in $\mathcal{R}_{lin}$. □
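In CLP(Term), where the universal terms are exactly the variables, conditions (C1) to (C3) can be checked by simple variable bookkeeping. The Python sketch below does so for a flat rule given by its sequences of head terms and body terms. It reads the conditions as quantified over every position i distinguished for the head predicate, which is an interpretation of the definition; the term encoding and the names `term_vars` and `is_dn_syn` are likewise assumptions made for the example.

```python
def is_var(t):
    """Variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()


def term_vars(t):
    """Set of variables of a term (strings or tuples (functor, args...))."""
    if isinstance(t, str):
        return {t} if is_var(t) else set()
    return set().union(*(term_vars(a) for a in t[1:]))


def is_dn_syn(s, t, tau_p, tau_q):
    """Check (C1)-(C3) for head terms s, body terms t, in the domain Term."""
    for i in tau_p:
        s_i = s[i - 1]
        if not is_var(s_i):                                  # (C1)
            return False
        for j in range(1, len(s) + 1):                       # (C2)
            if j != i and term_vars(s_i) & term_vars(s[j - 1]):
                return False
        for j in range(1, len(t) + 1):                       # (C3)
            if j not in tau_q and term_vars(s_i) & term_vars(t[j - 1]):
                return False
    return True


# Flat form of the recursive rule of APPEND: the list [A|B1] is (".", "A", "B1").
append_s = [(".", "A", "B1"), "B2", (".", "A", "B3")]
append_t = ["B1", "B2", "B3"]
```

For the flat rule of Example 14, the call `is_dn_syn(["Z"], ["Z"], {1}, {1})` succeeds, whereas distinguishing position 1 only in the head fails on (C3).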

Proposition 2. Let $r$ be a flat rule and $\tau$ be a set of positions. If $\tau$ is DNsyn
for $r$ then $\tau$ is DN for $r$. If $\tau$ is DN for $r$ then (C1) of Definition 10 holds.

Notice that a DN set of positions is not necessarily DNsyn because (C2) or 
(C3) of Definition 10 may not hold: 

Example 15. Let $C := \mathcal{R}_{lin}$.

- Let $r_1 := p_1(X_1, X_2) \leftarrow X_1 = A \wedge X_2 = A + B \diamond \Box$. The set of positions
  $\tau_1 := \langle p_1 \mapsto \{1, 2\} \rangle$ is DNlog for $r_1$, so $\tau_1$ is DN for $r_1$. But
  $\tau_1$ is not DNsyn for $r_1$ because, as the terms $A$ and $A + B$ share the
  variable $A$, (C2) does not hold.

- Let $r_2 := p_2(X_1, X_2) \leftarrow X_1 = A \wedge X_2 = 0 \wedge Y_2 = A - A \diamond p_2(Y_1, Y_2)$. The
  set of positions $\tau_2 := \langle p_2 \mapsto \{1\} \rangle$ is DNlog for $r_2$, so $\tau_2$ is DN
  for $r_2$. But $\tau_2$ is not DNsyn for $r_2$ because, as the terms $A$ and $A - A$
  share the variable $A$, (C3) does not hold. □

In the special case of logic programming, we have an equivalence: 

Theorem 4 (Logic Programming). Suppose that $C$ = Term. Let $r$ be a flat
rule and $\tau$ be a set of positions. Then, $\tau$ is DNsyn for $r$ if and only if $\tau$ is DN
for $r$.

Every rule $p(\tilde{s}) \leftarrow q(\tilde{t})$ in logic programming can be easily translated to a
rule $p(\tilde{x}) \leftarrow (\tilde{x} = \tilde{s} \wedge \tilde{y} = \tilde{t}) \diamond q(\tilde{y})$ in flat form. As the
only universal terms in $\Sigma_{Term}$ are the variables, Definition 10 is equivalent to
the one we gave in [13] for Derivation Neutral. Therefore, Theorem 4 states that in
the case of logic programming, we have a form of completeness because we cannot
get a better syntactic criterion than that of [13] (by "better", we mean a
criterion allowing us to distinguish at least the same positions).

5 Algorithms 

In this section, we describe a set of correct algorithms that allow us to infer
classes of left looping atomic queries from the text of a given (not necessarily
binary) program $P$. Using the operator $T_P^\beta$, our technique first computes a finite
subset of $bin\_unf(P)$ which is then analysed using DN sets of positions and a data
structure called loop dictionary.

5.1 Loop Dictionaries 

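Definition 11 (Looping Pair, Loop Dictionary). A looping pair has the
form $(BinSeq, \tau)$ where $BinSeq$ is a finite ordered sequence of binary rules, $\tau$
is a set of positions that is DN for $BinSeq$ and

- either $BinSeq = [p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})]$ where $\mathcal{D}_C \models \exists c$ and
  $\langle p(\tilde{y}) \mid c \rangle$ is $\tau$-more general than $\langle p(\tilde{x}) \mid c \rangle$

- or $BinSeq = [p(\tilde{x}) \leftarrow c \diamond q(\tilde{y}),\ p_1(\tilde{x}_1) \leftarrow c_1 \diamond q_1(\tilde{y}_1) \mid BinSeq']$
  and there exists a set of positions $\tau'$ which is such that
  $([p_1(\tilde{x}_1) \leftarrow c_1 \diamond q_1(\tilde{y}_1) \mid BinSeq'], \tau')$ is a looping pair and
  $\langle q(\tilde{y}) \mid c \rangle$ is $\tau'$-more general than $\langle p_1(\tilde{x}_1) \mid c_1 \rangle$.

A loop dictionary is a finite set of looping pairs.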

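Example 16. In the constraint domain $\mathcal{R}_{lin}$, the pair $(BinSeq, \tau)$ where
$BinSeq := [p(X) \leftarrow X > 0 \wedge X = Y \diamond q(Y),\ q(X) \leftarrow Y = 2 * X \diamond q(Y)]$ and $\tau$ is
the set of positions $\langle p \mapsto \emptyset, q \mapsto \{1\} \rangle$ is a looping one because:

- as $\tau$ is DNlog for $BinSeq$, by Theorem 3 it is DN for $BinSeq$,

- $([q(X) \leftarrow Y = 2 * X \diamond q(Y)], \tau')$, where $\tau' := \langle q \mapsto \{1\} \rangle$, is a looping
  pair because $\tau'$ is DN for $[q(X) \leftarrow Y = 2 * X \diamond q(Y)]$ (because it is DNlog for
  that program), $\mathcal{D}_{\mathcal{R}_{lin}} \models \exists (Y = 2 * X)$ and
  $\langle q(Y) \mid Y = 2 * X \rangle$ is $\tau'$-more general than $\langle q(X) \mid Y = 2 * X \rangle$,

- $\langle q(Y) \mid X > 0 \wedge X = Y \rangle$ is $\tau'$-more general than
  $\langle q(X) \mid Y = 2 * X \rangle$. □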
One motivation for introducing this definition is that a looping pair immediately
provides a looping atomic query: 

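Proposition 3. Let $([p(\tilde{x}) \leftarrow c \diamond q(\tilde{y}) \mid BinSeq], \tau)$ be a looping pair.
Then, $\langle p(\tilde{x}) \mid c \rangle$ loops w.r.t. $[p(\tilde{x}) \leftarrow c \diamond q(\tilde{y}) \mid BinSeq]$.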

Proof. By induction on the length of BinSeq, using Proposition 1 and Corol- 
lary 2. □ 

A second motivation for using loop dictionaries is that they can be built incre- 
mentally by simple algorithms as those described below. 

5.2 Getting a Loop Dictionary from a Binary Program 

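The simplest form of a looping pair is $([p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})], \tau)$ where
$\tau$ is a set of positions that is DN for $[p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})]$, where
$\mathcal{D}_C \models \exists c$ and $\langle p(\tilde{y}) \mid c \rangle$ is $\tau$-more general than
$\langle p(\tilde{x}) \mid c \rangle$. So, given a binary rule $p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$ such
that $\mathcal{D}_C \models \exists c$, if we hold a set of positions $\tau$ that is DN for
$p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$, it suffices to test if $\langle p(\tilde{y}) \mid c \rangle$ is
$\tau$-more general than $\langle p(\tilde{x}) \mid c \rangle$. If so, we have a looping pair
$([p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})], \tau)$. This is how the following function works.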

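unit_loop($p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$, $Dict$):

  in:  $p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$: a binary rule
       $Dict$: a loop dictionary
  out: $Dict'$: a loop dictionary

  1: $Dict' := Dict$
  2: if $\mathcal{D}_C \models \exists c$ then
  3:   $\tau$ := a DN set of positions for $p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$
  4:   if $\langle p(\tilde{y}) \mid c \rangle$ is $\tau$-more general than $\langle p(\tilde{x}) \mid c \rangle$ then
  5:     $Dict' := Dict' \cup \{([p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})], \tau)\}$
  6: return $Dict'$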

Termination of unit_loop is straightforward, provided that at line 3 we use a 
terminating algorithm to compute r. Partial correctness is deduced from the 
following theorem. 

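Theorem 5 (Partial Correctness of unit_loop). If $p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$ is a
binary rule and $Dict$ a loop dictionary, then
unit_loop($p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})$, $Dict$) is a loop dictionary, every element
$(BinSeq, \tau)$ of which is such that $(BinSeq, \tau) \in Dict$ or
$BinSeq = [p(\tilde{x}) \leftarrow c \diamond p(\tilde{y})]$.

Now suppose we hold a loop dictionary $Dict$ and a rule $p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$.
Then we may get some more looping pairs: it suffices to take the elements
$([p_1(\tilde{x}_1) \leftarrow c_1 \diamond q_1(\tilde{y}_1) \mid BinSeq'], \tau')$ of $Dict$ such that
$\langle q(\tilde{y}) \mid c \rangle$ is $\tau'$-more general than $\langle p_1(\tilde{x}_1) \mid c_1 \rangle$ and to
compute a set of positions $\tau$ that is DN for
$[p(\tilde{x}) \leftarrow c \diamond q(\tilde{y}),\ p_1(\tilde{x}_1) \leftarrow c_1 \diamond q_1(\tilde{y}_1) \mid BinSeq']$. Then
$([p(\tilde{x}) \leftarrow c \diamond q(\tilde{y}),\ p_1(\tilde{x}_1) \leftarrow c_1 \diamond q_1(\tilde{y}_1) \mid BinSeq'], \tau)$
is a looping pair. The following function works this way.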
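loops_from_dict($p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$, $Dict$):

  in:  $p(\tilde{x}) \leftarrow c \diamond q(\tilde{y})$: a binary rule
       $Dict$: a loop dictionary
  out: $Dict'$: a loop dictionary

  1: $Dict' := Dict$
  2: for each $([p_1(\tilde{x}_1) \leftarrow c_1 \diamond q_1(\tilde{y}_1) \mid BinSeq], \tau') \in Dict$ do
  3:   if $\langle q(\tilde{y}) \mid c \rangle$ is $\tau'$-more general than $\langle p_1(\tilde{x}_1) \mid c_1 \rangle$ then
  4:     $BinSeq' := [p(\tilde{x}) \leftarrow c \diamond q(\tilde{y}),\ p_1(\tilde{x}_1) \leftarrow c_1 \diamond q_1(\tilde{y}_1) \mid BinSeq]$
  5:     $\tau$ := a DN set of positions for $BinSeq'$
  6:     $Dict' := Dict' \cup \{(BinSeq', \tau)\}$
  7: return $Dict'$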

Termination of loops_from_dict follows from finiteness of Dict (because Dict
is a loop dictionary), provided that we use a terminating algorithm to compute
τ at line 5. Partial correctness follows from the result below.

Theorem 6 (Partial Correctness of loops_from_dict). Suppose that p(x̃) ←
c ◇ q(ỹ) is a binary rule and Dict is a loop dictionary. Then, loops_from_dict(p(x̃)
← c ◇ q(ỹ), Dict) is a loop dictionary, every element (BinSeq, τ) of which is
such that (BinSeq, τ) ∈ Dict or BinSeq = [p(x̃) ← c ◇ q(ỹ) | BinSeq'] for some
(BinSeq', τ') in Dict.
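
A minimal Python sketch of loops_from_dict, under the same kind of assumptions: BinaryRule, dn_positions and more_general are hypothetical names, and the two callables are placeholders for the DN computation (line 5) and the τ'-more general test (line 3), which are not implemented here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple


@dataclass(frozen=True)
class BinaryRule:
    """A binary rule head <- constraint <> body; all parts are opaque strings."""
    head: str
    constraint: str
    body: str


Query = Tuple[str, str]                  # (atom, constraint)
Positions = FrozenSet[Tuple[str, int]]   # (predicate symbol, argument index)
BinSeq = Tuple[BinaryRule, ...]
LoopDict = FrozenSet[Tuple[BinSeq, Positions]]


def loops_from_dict(
    rule: BinaryRule,
    dictionary: LoopDict,
    dn_positions: Callable[[BinSeq], Positions],              # assumed oracle
    more_general: Callable[[Query, Query, Positions], bool],  # assumed oracle
) -> LoopDict:
    result = set(dictionary)                                  # line 1
    # Iterate over the input dictionary, not over the growing result.
    for bin_seq, tau_prime in dictionary:                     # line 2
        first = bin_seq[0]  # sequences in a loop dictionary are non-empty
        body_query = (rule.body, rule.constraint)
        first_query = (first.head, first.constraint)
        if more_general(body_query, first_query, tau_prime):  # line 3
            new_seq = (rule,) + bin_seq                       # line 4
            tau = dn_positions(new_seq)                       # line 5
            result.add((new_seq, tau))                        # line 6
    return frozenset(result)                                  # line 7
```

Each new pair is obtained by prepending the given rule to a sequence already in the dictionary, which is the shape described in Theorem 6. The loop runs once per element of the finite input dictionary, so the sketch terminates whenever the two oracles do.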

Finally, here is the top-level function for inferring loop dictionaries from a 
finite set of binary rules. 

infer_loop_dict(BinProg):

  in:  BinProg: a finite set of binary rules
  out: a loop dictionary

  1: Dict := ∅
  2: for each p(x̃) ← c ◇ q(ỹ) ∈ BinProg do
  3:     if q = p then
  4:         Dict := unit_loop(p(x̃) ← c ◇ q(ỹ), Dict)
  5:     Dict := loops_from_dict(p(x̃) ← c ◇ q(ỹ), Dict)
  6: return Dict

Theorem 7 (Correctness of infer_loop_dict). Let BinProg be a finite set
of binary rules. Then, infer_loop_dict(BinProg) terminates and returns a loop
dictionary, every element (BinSeq, τ) of which is such that BinSeq ⊆ BinProg.

Proof. By Theorem 5 and Theorem 6. □ 
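
The control flow of infer_loop_dict can be sketched in Python as below. The two steps are passed in as callables so that the sketch stays independent of how they are implemented. The functions toy_unit_loop and toy_loops_from_dict are placeholders invented for this illustration: they skip the satisfiability, DN and τ-more general tests and use the string "tau" instead of a real set of positions.

```python
from typing import Callable, FrozenSet, Iterable, NamedTuple, Tuple


class BinaryRule(NamedTuple):
    head_pred: str    # predicate symbol p of the head
    constraint: str   # constraint c, kept opaque
    body_pred: str    # predicate symbol q of the body


LoopDict = FrozenSet[Tuple[Tuple[BinaryRule, ...], str]]
Step = Callable[[BinaryRule, LoopDict], LoopDict]


def infer_loop_dict(bin_prog: Iterable[BinaryRule],
                    unit_loop: Step,
                    loops_from_dict: Step) -> LoopDict:
    dictionary: LoopDict = frozenset()                   # line 1
    for rule in bin_prog:                                # line 2
        if rule.body_pred == rule.head_pred:             # line 3
            dictionary = unit_loop(rule, dictionary)     # line 4
        dictionary = loops_from_dict(rule, dictionary)   # line 5
    return dictionary                                    # line 6


# Placeholder steps (assumptions for illustration only, no real tests).
def toy_unit_loop(rule: BinaryRule, dictionary: LoopDict) -> LoopDict:
    return dictionary | {((rule,), "tau")}


def toy_loops_from_dict(rule: BinaryRule, dictionary: LoopDict) -> LoopDict:
    extra = {((rule,) + seq, "tau") for seq, _ in dictionary
             if seq[0].head_pred == rule.body_pred}
    return dictionary | extra


br1 = BinaryRule("sum", "c1", "pow2")
br2 = BinaryRule("pow2", "c2", "pow2")
toy_dict = infer_loop_dict([br2, br1], toy_unit_loop, toy_loops_from_dict)
```

In this toy run every sequence stored in toy_dict is built only from rules of the input list, in line with Theorem 7.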

5.3 Inferring Looping Conditions 

Finally, we present an algorithm which infers classes of left looping atomic queries
from the text of a given program. The classes we consider are defined by a pair
(S, τ) which finitely denotes the possibly infinite set [S]^τ:

Definition 12. Let S be an atomic query and τ be a set of positions. Then
[S]^τ denotes the class of atomic queries defined as:

    [S]^τ := { S' : an atomic query | S' is τ-more general than S } .

Once each element of [S]^τ left loops w.r.t. a CLP(C) program, we get a looping
condition for that program:

Definition 13 (Looping Condition). Let P be a CLP(C) program. A looping
condition for P is a pair (S, τ) such that each element of [S]^τ left loops w.r.t. P.

Looping conditions can be easily inferred from a loop dictionary. It suffices
to consider the property of looping pairs stated by Proposition 3. The following
function computes a finite set of looping conditions for any given CLP(C)
program.

infer_loop_cond(P, max):

  in:  P: a CLP(C) program
       max: a non-negative integer
  out: a finite set of looping conditions for P

  1: L := ∅
  2: Dict := infer_loop_dict(T_P^β ↑ max)
  3: for each ([p(x̃) ← c ◇ q(ỹ) | BinSeq], τ) ∈ Dict do
  4:     L := L ∪ {(⟨p(x̃) | c⟩, τ)}
  5: return L

A call to infer_loop_cond(P, max) terminates for any program P and any
non-negative integer max because, as T_P^β ↑ max is finite, at line 2 the call
to infer_loop_dict terminates and the loop at line 3 has a finite number of
iterations (because, by correctness of infer_loop_dict, Dict is finite). From
preliminary experiments on 50 logic programs, we find that the
maximum value for max is 4. Partial correctness of infer_loop_cond follows
from the next theorem.

Theorem 8 (Partial Correctness of infer_loop_cond). If P is a program
and max a non-negative integer, then infer_loop_cond(P, max) is a finite set
of looping conditions for P.

Proof. By Proposition 3, Theorem 7 and the Observing Termination Theorem 1. □
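
Lines 1 and 3 to 5 of infer_loop_cond amount to a projection of the loop dictionary, which can be sketched in Python as follows. The function name looping_conditions and the data layout are assumptions made for this illustration, and the dictionary is taken as already computed (line 2 is not covered). The sample values are those reported for the program SUM in Example 17, with the constraints abbreviated to the strings "c1" and "c2".

```python
from typing import FrozenSet, NamedTuple, Tuple


class BinaryRule(NamedTuple):
    head: str         # head atom p(x)
    constraint: str   # constraint c, kept opaque
    body: str         # body atom q(y)


Positions = FrozenSet[Tuple[str, int]]   # (predicate symbol, argument index)
LoopDict = FrozenSet[Tuple[Tuple[BinaryRule, ...], Positions]]
Condition = Tuple[Tuple[str, str], Positions]   # ((atom, constraint), tau)


def looping_conditions(dictionary: LoopDict) -> FrozenSet[Condition]:
    conditions = set()                                          # line 1
    for bin_seq, tau in dictionary:                             # line 3
        first = bin_seq[0]  # only the first rule of the sequence is used
        conditions.add(((first.head, first.constraint), tau))   # line 4
    return frozenset(conditions)                                # line 5


br1 = BinaryRule("sum(X1,X2)", "c1", "pow2(Y1,Y2,Y3)")
br2 = BinaryRule("pow2(X1,X2,X3)", "c2", "pow2(Y1,Y2,Y3)")
tau1 = frozenset({("sum", 2), ("pow2", 2), ("pow2", 3)})
tau2 = frozenset({("pow2", 2), ("pow2", 3)})
sum_dict = frozenset({((br2,), tau2), ((br1, br2), tau1)})
sum_conditions = looping_conditions(sum_dict)
```

Because the result is a set, dictionary entries that share the same first rule and the same set of positions yield a single looping condition.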

We point out that correctness of infer_loop_cond is independent of whether the
predicate symbols are analysed according to a topological sort of the strongly
connected components of the call graph of P. However, inference of looping
classes is much more efficient if predicate symbols are processed bottom-up.
Precision issues could be dealt with by comparing non-termination analysis with
termination analysis, as in [13].

Example 17. Consider the CLP(R_lin) program SUM:

    sum(X1, X2) ← X1 > 0 ∧ Y1 = X1 ∧ Y2 = 1 ∧ Z1 = X1 − 1 ∧ X2 = Y3 + Z2 ◇
                  pow2(Y1, Y2, Y3), sum(Z1, Z2)
    pow2(X1, X2, X3) ← X1 ≤ 0 ∧ X2 = X3 ◇ □
    pow2(X1, X2, X3) ← X1 > 0 ∧ Y1 = X1 − 1 ∧ Y2 = 2 * X2 ∧ Y3 = X3 ◇
                       pow2(Y1, Y2, Y3)

The set T_SUM^β ↑ 1 includes:

    br1 := sum(X1, X2) ← X1 > 0 ∧ Y1 = X1 ∧ Y2 = 1 ∧ Z1 = X1 − 1 ∧
                         X2 = Y3 + Z2 ◇ pow2(Y1, Y2, Y3)
    br2 := pow2(X1, X2, X3) ← X1 > 0 ∧ Y1 = X1 − 1 ∧ Y2 = 2 * X2 ∧ Y3 = X3 ◇
                              pow2(Y1, Y2, Y3)

A call to unit_loop(br2, ∅) returns Dict1 := {([br2], τ2)} where τ2 = ⟨pow2 ↦
{2, 3}⟩. A call to loops_from_dict(br1, Dict1) returns Dict1 ∪ {([br1, br2], τ1)}
where τ1 = ⟨sum ↦ {2}, pow2 ↦ {2, 3}⟩. Hence, a call to infer_loop_cond(SUM,
1) returns the looping conditions (⟨sum(X1, X2) | c1⟩, τ1) and (⟨pow2(X1, X2, X3) | c2⟩,
τ2) where c1 and c2 are the constraints of br1 and br2 respectively. □

6 Conclusion 

We have proposed a self-contained framework for non-termination analysis of
constraint logic programs. As usual [9], we were able to give simpler definitions
and proofs than in the logic programming setting. Also, starting from an operational
definition of derivation neutrality, we have given a new equivalent logical
definition. Then, by reexamining the syntactic criterion of derivation neutrality
that we proposed in [13], we have proved that this syntactic criterion can be
considered as a correct and complete implementation of derivation neutrality.